Methods and compositions for risk prediction, diagnosis, prognosis, and treatment of pulmonary disorders

ABSTRACT

The invention provides diagnostic and therapeutic targets for pulmonary disease, in particular, fibrotic lung disease. The inventors have found that a genetic variant MUC5B gene is associated with increased expression of the gene, increased risk of developing a pulmonary disease, and an improved prognosis and survival among those developing the pulmonary disease.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 61/298,473, filed Jan. 26, 2010, U.S. Provisional Application No. 61/298,814, filed Jan. 27, 2010, U.S. Provisional Application No. 61/323,238, filed Apr. 12, 2010, and U.S. Provisional Application No. 61/323,760, filed Apr. 13, 2010, the disclosures of which are incorporated herein in their entireties.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

The present invention was supported at least in part by government funding from the NIH Intramural Research Program of the National Institute of Environmental Health Sciences (Grant No. Z01-ES101947) and the National Heart, Lung, and Blood Institute (Grant Nos. U01-HL067467, R01-HL095393, R01-HL097163, P01-HL092870, and RC2-HL101715). The government has certain rights in the invention.

BACKGROUND OF THE INVENTION

Pulmonary fibrosis disorders are a growing concern in human and non-human populations. Pulmonary fibrosis is associated with a number of complex disorders (e.g., Hermansky-Pudlak syndrome, tuberous sclerosis, neurofibromatosis, and dyskeratosis congenita). Idiopathic interstitial pneumonia (IIP) represents a class of chronic pulmonary fibrotic disorders characterized by progressive scarring of the alveolar interstitium leading to severe dyspnea, hypoxemia, and death. Idiopathic pulmonary fibrosis (IPF) is the most common type of IIP and currently has the highest mortality. Despite being an area of intensive research, the etiology of IPF is largely unknown. Familial clustering of IPF and differential susceptibility of individuals to fibrogenic dusts have implicated genetics in the development of this disorder. Genetic variants in the telomerase reverse transcriptase (TERT), surfactant protein A1, and surfactant protein C genes have been implicated in development of familial interstitial pneumonia (FIP). However, these mutations only account for a small percentage of FIP cases. Familial association with IPF is 5-20%, and inheritance appears to be autosomal. The efficacy of current treatments, such as anti-fibrogenic agents, is variable, indicating a need for more individualized treatment.

Mucins represent a family of glycoproteins associated with mucosal epithelia. Mucins can be associated with the cell membrane or secreted, and typically form a component of mucus. Abnormal expression or mutations in these proteins have been associated with adenocarcinomas, as well as pulmonary disorders such as asthma and bronchitis.

The present inventors have found that genetic variants of the MUC5B gene are associated with pulmonary disease, and can provide a useful tool for prognosing the course of disease and determining a course of treatment. In addition, the increased level of MUC5B expression that results from the disclosed genetic variants provides a novel therapeutic target for pulmonary diseases such as IIP, IPF, and FIP.

BRIEF SUMMARY OF THE INVENTION

Accordingly, in some embodiments, the invention provides methods and compositions for diagnosis, risk prediction, and determining the course of pulmonary disease. The invention further provides personalized methods of treatment for pulmonary diseases.

In some embodiments, the invention provides methods of determining whether a subject has or is at risk of developing a pulmonary disease, said method comprising determining (detecting) whether a subject expresses an elevated MUC5B RNA level or an elevated MUC5B protein level relative to a standard (e.g., normal) control, wherein the presence of said elevated MUC5B RNA level or said elevated MUC5B protein level indicates said subject has or is at risk of developing a pulmonary disease. In some embodiments, the pulmonary disease is an interstitial lung disease, e.g., a fibrotic interstitial lung disease, such as idiopathic pulmonary fibrosis or familial interstitial pneumonia.

The level of MUC5B RNA or protein can be determined using an in vitro assay or in vivo imaging assay. In some embodiments, said elevated MUC5B protein level or said elevated MUC5B RNA level is determined from a biological sample from the subject, e.g., a pulmonary tissue or bodily fluid of said subject. The bodily fluid can be, e.g., whole blood, plasma, serum, urine, sputum, saliva, a bronchoalveolar lavage sample, or exhaled breath condensate. In some embodiments, the sample is further processed, e.g., to separate cellular components or subcellular components. For example, the determining can further comprise separating cells from the remaining sample, or isolating exosomes or subcellular vesicles.

In some embodiments, the method further comprises administering a treatment to the subject, e.g., a pulmonary disease treatment, or interstitial lung disease treatment. In some embodiments, the treatment is a mucolytic agent. In some embodiments, the treatment is a MUC5B antagonist. In some embodiments, the method further comprises determining a second MUC5B RNA level or MUC5B protein level after administering said treatment and comparing said second level to the level observed before administering said treatment.

In some embodiments, the expression level of at least one additional pulmonary disease marker is determined and compared to a standard control. For example, the at least one additional pulmonary disease marker can be selected from the group consisting of Surfactant Protein A, Surfactant Protein D, KL-6/MUC1, CC16, CK-19, Ca 19-9, SLX, MCP-1, MIP-1a, ITAC, glutathione, type III procollagen peptide, sIL-2R, ACE, neopterin, beta-glucuronidase, LDH, CCL-18, CCL-2, CXCL12, MMP7, and osteopontin. An aberrant expression level of the pulmonary disease marker indicates that the subject has or is at risk of developing a pulmonary disease. In some embodiments, the aberrant expression is elevated relative to a normal control. In some embodiments, the aberrant expression is reduced relative to a normal control. In some embodiments, the method comprises determining whether the genome of the subject comprises a genetic variant of the at least one additional pulmonary disease marker selected from the group consisting of Surfactant Protein A2, Surfactant Protein B, Surfactant Protein C, TERC, TERT, IL-1RN, IL-1α, IL-1β, TNF, Lymphotoxin α, TNF-RII, IL-10, IL-6, IL-12, IFNγ, TGFβ, CR1, ACE, IL-8, CXCR1, CXCR2, MUC1 (KL6), and MUC5AC, wherein the presence of a genetic variant of the at least one additional pulmonary disease marker is indicative that the subject has or is at risk of developing a pulmonary disease. In some embodiments, the method does not comprise determining whether the genome of the subject comprises a genetic variant of MUC5AC.

In some embodiments, the standard control is obtained from a normal, non-diseased sample. In some embodiments, the standard control is from a different individual or pool of individuals. In some embodiments, the standard control is a standard obtained from a population of individuals that do not have a pulmonary disease. In some embodiments, the standard control is obtained from the same individual, e.g., obtained at a different time, e.g., prior to exposure to an airway stressor. Typically, when detecting or determining the expression level of a given RNA or protein (e.g., MUC5B), the same RNA or protein is detected in the standard control. However, in some embodiments, a different RNA or protein can be detected and the ratio used to determine whether the RNA or protein level from the subject is elevated. Moreover, in some embodiments, the method can comprise comparison to a positive control, e.g., from a known pulmonary disease sample, or a sample from a known individual or pool of individuals that carry a genetic variant MUC5B gene or have elevated MUC5B expression.

In some embodiments, the invention provides methods of determining whether a subject has or is at risk of developing a pulmonary disease, said method comprising detecting (determining) whether a genome of a subject comprises a genetic variant MUC5B gene, wherein the presence of said genetic variant MUC5B gene indicates said subject has or is at risk of developing a pulmonary disease. In some embodiments, the pulmonary disease is an interstitial lung disease, e.g., a fibrotic interstitial lung disease, such as idiopathic pulmonary fibrosis or familial interstitial pneumonia.

In some embodiments, the genetic variant MUC5B gene in said subject results in elevated expression of MUC5B RNA or MUC5B protein. In some embodiments, the subject is homozygous for said genetic variant MUC5B gene. In some embodiments, the subject is heterozygous for said genetic variant MUC5B gene. In some embodiments, the subject lacks the genetic variant MUC5B gene. In some embodiments, the genetic variant MUC5B gene is a genetic variant regulatory region MUC5B gene, e.g., a genetic variant promoter MUC5B gene. In some embodiments, the genetic variant MUC5B gene has a single nucleotide polymorphism (SNP). In some embodiments, the SNP is selected from the group consisting of rs2672792, rs72636989, MUC5B-Prm1, rs2672794, rs35705950, MUC5B-Prm2, rs11042491, rs2735726, rs868902, MUC5B-Prm3, MUC5B-Prm4, MUC5B-Prm5, rs868903, MUC5B-Prm6, rs885455, rs885454, MUC5B-Prm7, rs7115457, rs7118568, rs56235854, and rs2735738. In some embodiments, the presence of more than one SNP is determined. In some embodiments, the SNP is rs35705950.

In some embodiments, the genetic variant MUC5B gene comprises a first single nucleotide polymorphism (SNP) and a second SNP. In some embodiments, the first SNP is present within a first DNA strand and said second SNP is present within a second DNA strand. In some embodiments, the first and second SNP are present within the same DNA strand.

In some embodiments, the determining comprises use of at least one sequence selected from the group consisting of SEQ ID NOs:20-53 to determine whether the genome of the subject comprises a genetic variant MUC5B gene, e.g., by using an appropriate nucleic acid assay to detect the variant nucleotide in the selected sequence. For example, the determining can comprise use of one or more of the sequences of SEQ ID NOs:20-53 in an RT-PCR, array hybridization, or other appropriate SNP detection method as described herein. In some embodiments, the determining comprises (i) contacting a sample from the subject with a nucleic acid probe having at least 10 contiguous nucleotides of at least one of the sequences selected from SEQ ID NOs:20-53, or its complement, wherein said 10 contiguous nucleotides span the genetic variant nucleotide (i.e., the position of the SNP shown for each sequence), and (ii) determining whether the nucleic acid probe hybridizes to a nucleic acid in the sample. In some embodiments, the at least one sequence includes SEQ ID NO:24, wherein the presence of a T at position 28 of SEQ ID NO:24 indicates a genetic variant MUC5B gene, and that the subject has or will have an attenuated form of the pulmonary disease. The presence of a G at position 28 of SEQ ID NO:24 indicates that the subject has or will have a more severe form of the pulmonary disease (e.g., where the subject is homozygous for G at position 28, or lacking a genetic variant promoter MUC5B gene).
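By way of illustration only, the two checks described in the preceding paragraph (that a probe of at least 10 contiguous nucleotides spans the variant nucleotide, and that a genotype call at position 28 of SEQ ID NO:24 is read as variant or non-variant) can be sketched as follows. The function names, the 1-based coordinate convention, and the two-letter genotype format are assumptions made for this sketch and are not part of the disclosure; no actual sequence from SEQ ID NOs:20-53 is reproduced.

```python
# Illustrative sketch only. All names here are hypothetical and are not
# taken from the disclosure or from any particular genotyping software.


def probe_spans_snp(probe_start: int, probe_length: int, snp_position: int,
                    min_length: int = 10) -> bool:
    """Return True if a probe covers the SNP position.

    Positions are 1-based coordinates within the reference sequence
    (an assumed convention). The probe must be at least `min_length`
    nucleotides long and must include `snp_position`.
    """
    probe_end = probe_start + probe_length - 1
    return probe_length >= min_length and probe_start <= snp_position <= probe_end


def interpret_position_28(genotype: str) -> str:
    """Classify a two-allele genotype call (e.g. "GT") at position 28.

    T is treated as the variant allele and G as the non-variant allele,
    following the paragraph above.
    """
    alleles = genotype.upper()
    if len(alleles) != 2 or not set(alleles) <= {"G", "T"}:
        raise ValueError(f"unexpected genotype call: {genotype!r}")
    t_count = alleles.count("T")
    if t_count == 0:
        return "non-variant (G/G)"
    if t_count == 1:
        return "heterozygous variant (G/T)"
    return "homozygous variant (T/T)"
```

For example, a 10-nucleotide probe starting at position 20 covers positions 20 through 29 and therefore spans position 28, whereas a 10-nucleotide probe starting at position 1 does not.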

In some embodiments, the method further comprises determining whether said individual expresses an elevated MUC5B RNA level or an elevated MUC5B protein level relative to a standard control, wherein the presence of said elevated MUC5B RNA level or said elevated MUC5B protein level further indicates said subject has or is at risk of developing a pulmonary disease. Said step of determining can be carried out as discussed above.

In some embodiments, the method does not comprise determining whether the individual expresses an elevated level of MUC5AC RNA or protein. In some embodiments, the method does not comprise determining whether said individual expresses an elevated level of a second RNA or protein other than a MUC5B RNA or protein. In some embodiments, the method does not comprise determining whether said individual expresses an elevated level of a second RNA or protein other than a MUC5B RNA or protein, unless said second RNA or protein is a MUC5AC RNA or protein.

In some embodiments, the method further comprises administering a treatment to the subject, e.g., a pulmonary disease treatment, or interstitial lung disease treatment. In some embodiments, the treatment is a mucolytic agent. In some embodiments, the treatment is a MUC5B antagonist, e.g., a small molecule that inhibits MUC5B production or activity. In some embodiments, the method further comprises determining a second MUC5B RNA level or MUC5B protein level after administering said treatment and comparing said second level to the level observed before administering said treatment.

In some embodiments, the method further comprises determining whether the genome of the subject comprises at least one additional genetic variant pulmonary disease marker gene. In some embodiments, the at least one additional pulmonary disease marker can be selected from the group consisting of Surfactant Protein A2, Surfactant Protein B, Surfactant Protein C, TERC, TERT, IL-1RN, IL-1α, IL-1β, TNF, Lymphotoxin α, TNF-RII, IL-10, IL-6, IL-12, IFNγ, TGFβ, CR1, ACE, IL-8, CXCR1, CXCR2, MUC1 (KL6), and MUC5AC. The presence of an additional genetic variant pulmonary disease marker gene can indicate that the subject is at risk of or has a pulmonary disease.

In some embodiments, the presence of the genetic variant MUC5B gene indicates that the subject has an attenuated form of the pulmonary disease. That is, the subject will have a reduced severity of symptoms, more gradual loss of lung function, or increased survival compared to the normal, non-attenuated form of the pulmonary disease, i.e., compared to the pulmonary disease as it occurs in an individual that does not have a genetic variant MUC5B gene.

Thus, in some embodiments, the invention provides methods of prognosing a pulmonary disease in a patient, said method comprising determining whether a genome of a subject comprises a genetic variant MUC5B gene, wherein the presence of said genetic variant MUC5B gene indicates an attenuated form of said pulmonary disease in said patient relative to the absence of said genetic variant MUC5B gene. The absence of a genetic variant MUC5B gene can indicate that the patient has a more aggressive form of said pulmonary disease. In some embodiments, the pulmonary disease is an interstitial lung disease, e.g., a fibrotic interstitial lung disease, such as idiopathic pulmonary fibrosis or familial interstitial pneumonia. Said genetic variant MUC5B gene can be as described above.

In some embodiments, the method further comprises setting a course of treatment for the subject, e.g., based on the presence of a genetic variant MUC5B gene in the subject. For example, the presence of a genetic variant MUC5B gene, or the level of MUC5B gene expression, can be determined in the subject, a treatment administered to the subject, and the progress of the subject monitored, e.g., by monitoring MUC5B expression over time or other pulmonary diagnostic indicators, and determining whether further treatment is necessary. Thus, in some embodiments, the method further comprises administering a pulmonary disease treatment or an interstitial lung disease treatment to the subject. In some embodiments, the method further comprises determining whether the genome of the subject comprises a genetic variant MUC5B gene, wherein the presence of a genetic variant MUC5B gene indicates an attenuated form of said interstitial lung disease in said subject.

In some embodiments, the invention provides methods of determining whether a pulmonary disease is progressing in a pulmonary disease patient, said method comprising: (i) determining a first level of MUC5B RNA or first level of MUC5B protein in said patient at a first time point; (ii) determining a second level of MUC5B RNA or second level of MUC5B protein in said patient at a second time point; and (iii) comparing the second level of MUC5B RNA to the first level of MUC5B RNA or comparing the second level of MUC5B protein to the first level of MUC5B protein, wherein if the second level of MUC5B RNA is greater than the first level of MUC5B RNA or if the second level of MUC5B protein is greater than the first level of MUC5B protein, the pulmonary disease is progressing in the patient. In some embodiments, the pulmonary disease is an interstitial lung disease, e.g., a fibrotic interstitial lung disease, such as idiopathic pulmonary fibrosis or familial interstitial pneumonia.

In some embodiments, the method further comprises determining the rate of progression based on said comparing. That is, a rapid increase in MUC5B expression in a short time is correlated with more rapid progression of the pulmonary disease. In some embodiments, said determining said first level of MUC5B RNA or first level of MUC5B protein and said second level of MUC5B RNA or second level of MUC5B protein comprises normalizing said first level of MUC5B RNA or first level of MUC5B protein and said second level of MUC5B RNA or second level of MUC5B protein to a level of RNA or protein expressed from a standard gene in said interstitial lung disease patient, e.g., GAPDH, beta-actin, HPRT1, beta-tubulin, or beta-2 microglobulin.
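By way of illustration only, the comparison of normalized levels at two time points can be sketched as follows. The sketch assumes that MUC5B and the standard gene (e.g., GAPDH) have each been quantified in the same arbitrary units at each time point; the function names and the use of days as the time unit are hypothetical choices made for this example, and no threshold for what counts as "rapid" is implied.

```python
# Illustrative sketch only; names and units are assumptions for this example.


def normalized_level(muc5b_level: float, reference_level: float) -> float:
    """Express a MUC5B measurement relative to a standard gene (e.g. GAPDH)."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return muc5b_level / reference_level


def is_progressing(first_level: float, second_level: float) -> bool:
    """Disease is treated as progressing if the second level exceeds the first."""
    return second_level > first_level


def rate_of_change(first_level: float, second_level: float,
                   days_between: float) -> float:
    """Change in normalized level per day between the two time points."""
    if days_between <= 0:
        raise ValueError("second time point must be after the first")
    return (second_level - first_level) / days_between


# Example: normalize each time point, then compare.
first = normalized_level(muc5b_level=15.0, reference_level=10.0)   # 1.5
second = normalized_level(muc5b_level=30.0, reference_level=10.0)  # 3.0
progressing = is_progressing(first, second)
per_day = rate_of_change(first, second, days_between=30.0)
```

Normalizing both time points to the same standard gene means that differences in sample input between visits do not by themselves appear as a change in MUC5B level.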

In some embodiments, the invention provides methods of treating, preventing, or ameliorating a pulmonary disease in a subject in need thereof, the method comprising administering to said subject an effective amount of a MUC5B antagonist, wherein said antagonist reduces the expression of the MUC5B gene or reduces the activity of the MUC5B protein as compared to the expression or activity in the absence of said MUC5B antagonist, thereby treating, preventing, or ameliorating the pulmonary disease in the subject. In some embodiments, the MUC5B antagonist is a nucleic acid, e.g., a pRNA, siRNA, or antisense sequence, and reduces expression of the MUC5B gene. In some embodiments, the MUC5B antagonist is a small molecule, e.g., that reduces translation of MUC5B mRNA or packaging or activity of the MUC5B protein. In some embodiments, the MUC5B antagonist is selected from the group consisting of: a MUC5B antibody or MUC5B-binding fragment thereof, a MUC5B-binding aptamer, and a mucolytic agent. In some embodiments, the MUC5B antagonist nucleic acid is capable of hybridizing to at least a 10-nucleotide contiguous sequence of a MUC5B encoding target nucleic acid sequence. In some embodiments, the method further comprises monitoring the subject, e.g., by determining the level of MUC5B RNA or protein before and after said administering, or at one or more time points after said administering. Thus, in some embodiments, the method of treatment includes a step of determining whether the genome of the subject comprises a genetic variant MUC5B gene, and/or a step of determining whether the subject has an elevated level of MUC5B RNA or protein, as described herein.

In some embodiments, the invention provides methods of identifying a candidate pulmonary disease treatment compound, said method comprising: (i) contacting a test compound with a MUC5B protein; (ii) allowing said test compound to inhibit the activity of said MUC5B protein; and (iii) selecting the test compound that inhibits the activity of said MUC5B protein, thereby identifying a candidate pulmonary disease treatment compound. In some embodiments, the method is carried out in vivo, e.g., in an animal model for pulmonary disease. In some embodiments, the method is carried out in vitro.

In some embodiments, the invention provides methods of identifying a candidate pulmonary disease treatment compound, said method comprising: (i) contacting a test compound with a MUC5B secreting cell; (ii) allowing said test compound to inhibit secretion of MUC5B protein from said MUC5B secreting cell; and (iii) selecting the test compound that inhibits secretion of MUC5B protein from said MUC5B secreting cell, thereby identifying a candidate pulmonary disease treatment compound. In some embodiments, said MUC5B secreting cell is in vitro. In some embodiments, said MUC5B secreting cell forms part of a pulmonary tissue. In some embodiments, said pulmonary tissue forms part of an organism, i.e., the method is carried out in vivo. In some embodiments, the organism is a mammal, e.g., an animal model or a human.

The invention further provides kits, e.g., for determining whether a subject expresses an elevated level of MUC5B RNA or MUC5B protein, or carries a genetic variant MUC5B gene. In some embodiments, the kit comprises (a) a MUC5B binding agent capable of binding to a substance selected from the group consisting of (i) a genetic variant MUC5B gene sequence; (ii) a MUC5B RNA or fragment thereof; and (iii) a MUC5B protein or fragment thereof, and (b) a detecting reagent or a detecting apparatus capable of indicating binding of said MUC5B binding agent to said substance. In some embodiments, the MUC5B binding agent is labeled, e.g., with a fluorescent label or radioisotope. In some embodiments, the kit further comprises a sample collection device for collecting a sample from the subject. In some embodiments, the MUC5B binding agent binds a genetic variant MUC5B gene in the promoter region. In some embodiments, the kit further comprises at least one control sample, e.g., a non-variant MUC5B gene sequence or a sample from a normal, non-disease control.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 represents a flow chart related to the genetic study design described herein.

FIG. 2 represents multipoint LOD score graphs for a whole genome screen (884 markers with an average inter-marker distance of 4.2 centimorgans (cM)) in 82 families with two or more cases of IIP.

FIG. 3 illustrates a pair-wise linkage disequilibrium (LD) plot for SNPs significantly associated with IPF or FIP by allelic association test in the genetic screen of lung-expressed gel-forming mucins. LD values displayed are calculated by the r² statistic for the mucin genetic screen IPF subjects (n=492). The multi-colored graphic above the plot indicates the approximate location of these SNPs within the gel-forming mucin region. The highly significant MUC5B promoter SNP (rs35705950) and the corresponding pairwise LD values are highlighted in red. Intergenic region is abbreviated as Int, and the MUC5B promoter is abbreviated as Pr. LD patterns were qualitatively similar in the controls, although in most instances the LD was weaker among controls.

FIGS. 4A-4C represent illustrations of MUC5B gene expression in IPF (N=33) and unaffected subjects (N=47) stratified by MUC5B promoter SNP (rs35705950) genotype and smoking status. A. MUC5B gene expression among unaffected and IPF subjects, color coded based on whether subjects are wildtype (dark grey) or heterozygous for the MUC5B promoter SNP (light grey). B. Comparison of MUC5B expression in unaffected subjects, among unaffected smokers only, and among unaffected non-smokers only, by MUC5B promoter SNP genotype. C. Comparison of MUC5B expression in all IPF subjects, among IPF smokers only, and among IPF non-smokers only, by MUC5B promoter SNP genotype. Lines represent group medians, and the expression of MUC5B is determined relative to GAPDH expression.

FIGS. 5A-5C represent MUC5B immunohistochemistry of unaffected and IPF tissue. Tissue sections stained for MUC5B distribution in both the unaffected and IPF lung show strong specific cytoplasmic staining within secretory columnar cells of the bronchi and larger proximal bronchioles (FIG. 5A). In subjects with IPF, regions of dense accumulation of MUC5B were observed in areas of microscopic honeycombing and involved patchy staining of the metaplastic epithelia lining the honeycomb cysts (FIG. 5B), as well as the mucus plugs within the cysts (FIG. 5C).

DETAILED DESCRIPTION OF THE INVENTION

The invention provides novel methods and compositions for diagnosing and predicting the severity of pulmonary disease, and a novel therapeutic target for ameliorating pulmonary disease. The inventors have found that individuals carrying genetic variants of the MUC5B gene that have elevated expression of the gene have an increased likelihood of developing a pulmonary disease, e.g., an interstitial lung disease such as fibrotic interstitial lung disease, idiopathic pulmonary fibrosis, familial interstitial pneumonia, etc. The presence of some genetic variations in the MUC5B gene, while increasing the likelihood of a pulmonary disease, is indicative of an attenuated form of the disease, e.g., a more gradual progression of symptoms and improved survival.

I. DEFINITIONS

The terms “pulmonary disease,” “pulmonary disorder,” “lung disease,” etc. are used interchangeably herein. The terms are used broadly to refer to lung disorders characterized by difficulty breathing, coughing, airway discomfort and inflammation, increased mucus, and/or pulmonary fibrosis.

Mucins are a family of high molecular weight, heavily glycosylated proteins (glycoproteins) produced by mammalian epithelia. Secreted, gel-forming mucins form a component of mucus. Typically, the N- and C-terminal ends of mucin proteins are lightly glycosylated, but rich in disulfide bond-forming cysteine residues.

Mucin 5b (MUC5B) is a gel-forming mucin expressed in airway epithelial tissue. Additional gel-forming mucins, MUC2, MUC5AC, and MUC6, have been mapped to the same chromosomal region on human chromosome 11. MUC5B is further characterized in Desseyn et al. (1998) J. Biol. Chem. 273:30157-64.

The term “genetic variant,” in the context of a particular gene, refers to a gene with a variant (e.g., non-standard or abnormal) nucleic acid sequence. The gene includes coding and non-coding sequences, such as regulatory regions. Genetic variants include mutations and polymorphic sequences. Thus, the genetic variant may affect the expression or activity of the gene or gene product. The genetic variant may be an insertion of one or more nucleotides, a deletion of one or more nucleotides, or a substitution of one or more nucleotides. A single nucleotide polymorphism (SNP) is an example of a genetic variant.

The term “genetic variant MUC5B gene” refers to a MUC5B genetic variant (a MUC5B gene with a genetic variation as described above). The term “genetic variant promoter MUC5B gene” refers to a variation that is specifically in the promoter region of the MUC5B gene. Similarly, “genetic variant regulatory region MUC5B gene” and “genetic variant intronic MUC5B gene” localize the variation within the MUC5B gene. An example of a genetic variant MUC5B gene is rs35705950, which includes a SNP in the promoter region.

An “airway mucosal sample” can be obtained using methods known in the art, e.g., a bronchial epithelial brush as described herein. Additional methods include endobronchial biopsy, bronchial wash, bronchoalveolar lavage, whole lung lavage, transendoscopic biopsy, and transtracheal wash.

The terms “subject,” “patient,” “individual,” etc. are not intended to be limiting and can be generally interchanged. That is, an individual described as a “patient” does not necessarily have a given disease, but may be merely seeking medical advice.

A “control” sample or value refers to a sample that serves as a reference, usually a known reference, for comparison to a test sample. For example, a test sample can be taken from a patient suspected of having a given pulmonary disease and compared to samples from a known pulmonary disease patient, known genetic variant MUC5B carrier, or a known normal (non-disease) individual. A control can also represent an average value gathered from a population of similar individuals, e.g., pulmonary disease patients or healthy individuals with a similar medical background, same age, weight, etc. A control value can also be obtained from the same individual, e.g., from an earlier-obtained sample, prior to disease, or prior to treatment. One of skill will recognize that controls can be designed for assessment of any number of parameters.

One of skill in the art will understand which controls are valuable in a given situation and be able to analyze data based on comparisons to control values. Controls are also valuable for determining the significance of data. For example, if values for a given parameter are widely variant in controls, variation in test samples will not be considered as significant.

As used herein, the term “pharmaceutically acceptable” is used synonymously with physiologically acceptable and pharmacologically acceptable. A pharmaceutical composition will generally comprise agents for buffering and preservation in storage, and can include buffers and carriers for appropriate delivery, depending on the route of administration.

The terms “dose” and “dosage” are used interchangeably herein. A dose refers to the amount of active ingredient given to an individual at each administration. For the present invention, the dose will generally refer to the amount of pulmonary disease treatment, anti-inflammatory agent, or MUC5B antagonist. The dose will vary depending on a number of factors, including the range of normal doses for a given therapy; frequency of administration; size and tolerance of the individual; severity of the condition; risk of side effects; and the route of administration. One of skill will recognize that the dose can be modified depending on the above factors or based on therapeutic progress. The term “dosage form” refers to the particular format of the pharmaceutical, and depends on the route of administration. For example, a dosage form can be in a liquid form for nebulization, e.g., for inhalants, in a tablet or liquid, e.g., for oral delivery, or a saline solution, e.g., for injection.

As used herein, the terms “treat” and “prevent” are not intended to be absolute terms. Treatment can refer to any delay in onset, reduction in the frequency or severity of symptoms, amelioration of symptoms, improvement in patient comfort and/or respiratory function, etc. The effect of treatment can be compared to an individual or pool of individuals not receiving a given treatment, or to the same patient prior to, or after cessation of, treatment.

The term “prevent” refers to a decrease in the occurrence of pulmonary disease symptoms in a patient. As indicated above, the prevention may be complete (no detectable symptoms) or partial, such that fewer symptoms are observed than would likely occur absent treatment.

The term “therapeutically effective amount,” as used herein, refers to that amount of the therapeutic agent sufficient to ameliorate the disorder, as described above. For example, for the given parameter, a therapeutically effective amount will show an increase or decrease of at least 5%, 10%, 15%, 20%, 25%, 40%, 50%, 60%, 75%, 80%, 90%, or at least 100%. Therapeutic efficacy can also be expressed as “-fold” increase or decrease. For example, a therapeutically effective amount can have at least a 1.2-fold, 1.5-fold, 2-fold, 5-fold, or more effect over a control.
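By way of illustration only, the percentage and “-fold” expressions of effect described above can be computed from a control value and a treated value of the same parameter as sketched below. The function names are hypothetical, the 5% default is simply the lowest figure listed above, and the convention of reporting a decrease as a fold value of at least 1 (e.g., a halving as a 2-fold decrease) is an assumption made for this sketch.

```python
# Illustrative sketch only; names and conventions are assumptions.


def percent_change(control: float, treated: float) -> float:
    """Signed change of the treated value relative to control, in percent."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (treated - control) / control * 100.0


def fold_change(control: float, treated: float) -> float:
    """Magnitude of the effect as a fold value >= 1, regardless of direction."""
    if control <= 0 or treated <= 0:
        raise ValueError("values must be positive to compute a fold change")
    ratio = treated / control
    return ratio if ratio >= 1.0 else 1.0 / ratio


def meets_percent_threshold(control: float, treated: float,
                            min_percent: float = 5.0) -> bool:
    """True if the increase or decrease is at least `min_percent` percent."""
    return abs(percent_change(control, treated)) >= min_percent
```

Under these conventions, a parameter that falls from 10 units in the control to 5 units after treatment shows a 50% decrease, which is a 2-fold effect.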

The term “diagnosis” refers to a relative probability that a pulmonary disease is present in the subject. Similarly, the term “prognosis” refers to a relative probability that a certain future outcome may occur in the subject. For example, in the context of the present invention, prognosis can refer to the likelihood that an individual will develop a pulmonary disease, or the likely severity of the disease (e.g., severity of symptoms, rate of functional decline, survival, etc.). The terms are not intended to be absolute, as will be appreciated by any one of skill in the field of medical diagnostics.

The terms “correlating” and “associated,” in reference to determination of a pulmonary disease risk factor, refer to comparing the presence or amount of the risk factor (e.g., dysregulation or genetic variation in a mucin gene) in an individual to its presence or amount in persons known to suffer from, or known to be at risk of, the pulmonary disease, or in persons known to be free of pulmonary disease, and assigning an increased or decreased probability of having/developing the pulmonary disease to an individual based on the assay result(s).

“Nucleic acid” or “oligonucleotide” or “polynucleotide” or grammatical equivalents used herein means at least two nucleotides covalently linked together. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are included that may have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs may be made.

The terms “identical” or percent “identity,” in the context of two or more nucleic acids (e.g., genomic sequences or subsequences, such as shown in SEQ ID NOs:20-53, or coding sequences) or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more identity over a specified region), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” This definition also refers to the complement of a test sequence. Optionally, the identity exists over a region that is at least about 10 to about 100, about 20 to about 75, or about 30 to about 50 amino acids or nucleotides in length.

Examples of algorithms suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997), respectively. As will be appreciated by one of skill in the art, the software for performing BLAST analyses is publicly available through the website of the National Center for Biotechnology Information (ncbi.nlm.nih.gov).
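For illustration, percent identity over a designated region can be sketched as a simple position-by-position count on two sequences that have already been aligned. This is not the BLAST algorithm, which also performs the alignment itself; the function name, the treatment of gaps as mismatches, and the example sequences are assumptions made for this sketch only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions carrying the same residue in both sequences.

    Assumes the inputs are already aligned and of equal length. A gap
    character ('-') in either sequence is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-nucleotide sequences differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```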

The terms “polypeptide,” “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those containing modified residues, and non-naturally occurring amino acid polymers.

The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function similarly to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.

“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical or associated, e.g., naturally contiguous, sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to another of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes silent variations of the nucleic acid. One of skill will recognize that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, silent variations of a nucleic acid which encodes a polypeptide are often implicit in a described sequence with respect to the expression product, but not with respect to actual probe sequences.
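The notion of a silent variation can be illustrated with a short Python sketch. The codon table below is deliberately partial and covers only the codons named in the preceding paragraph (the four alanine codons, AUG, and UGG); the function names and the example RNA sequences are hypothetical.

```python
# Partial codon table: only the codons discussed above. Any other codon
# raises KeyError in translate(); a full table would be needed in practice.
CODON_TABLE = {
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCU": "Ala",
    "AUG": "Met",
    "UGG": "Trp",
}


def translate(rna: str) -> list[str]:
    """Translate an RNA string codon by codon, ignoring trailing bases."""
    usable = len(rna) - len(rna) % 3
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3)]


def is_silent(original: str, variant: str) -> bool:
    """True if the variant differs in sequence but encodes the same residues."""
    return original != variant and translate(original) == translate(variant)


# GCA -> GCC changes the nucleic acid but still encodes alanine.
print(is_silent("AUGGCAUGG", "AUGGCCUGG"))  # True
```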

As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are “conservatively modified variants” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention. The following groups each contain amino acids that are typically conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
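As a non-limiting sketch, the eight groups listed above can be encoded as sets of one-letter codes and used to test whether a given substitution is conservative. The function name and the treatment of an unchanged residue as "not a substitution" are assumptions of this example.

```python
# The eight groups listed above, as one-letter amino acid codes.
# Methionine (M) appears in both group 5 and group 8, as in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if both residues differ and share at least one listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative_substitution("D", "E"))  # True  (group 2)
print(is_conservative_substitution("M", "C"))  # True  (group 8)
print(is_conservative_substitution("A", "W"))  # False (no shared group)
```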

A “label” or a “detectable moiety” is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other entities which can be made detectable, e.g., by incorporating a radiolabel into a peptide or antibody specifically reactive with a target peptide. Any method known in the art for conjugating an antibody to the label may be employed, e.g., using methods described in Hermanson, Bioconjugate Techniques 1996, Academic Press, Inc., San Diego.

A “labeled nucleic acid probe or oligonucleotide” is one that is bound, either covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be detected by detecting the presence of the label bound to the probe. Alternatively, methods using high affinity interactions may achieve the same results where one of a pair of binding partners binds to the other, e.g., biotin and streptavidin.

The phrase “selectively (or specifically) hybridizes to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence with a higher affinity, e.g., under more stringent conditions, than to other nucleotide sequences (e.g., total cellular or library DNA or RNA). One of skill in the art will appreciate that specific hybridization between nucleotides usually relies on Watson-Crick pair bonding between complementary nucleotide sequences.

The term “probe” or “primer”, as used herein, is defined to be one or more nucleic acid fragments whose specific hybridization to a sample can be detected. A probe or primer can be of any length depending on the particular technique it will be used for. For example, PCR primers are generally between 10 and 40 nucleotides in length, while nucleic acid probes for, e.g., a Southern blot, can be more than a hundred nucleotides in length. The probe may be unlabeled or labeled as described below so that its binding to the target or sample can be detected. The probe can be produced from a source of nucleic acids from one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure, and to provide the required resolution among different genes or genomic locations.

The probe may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array. In some embodiments, the probe may be a member of an array of nucleic acids as described, for instance, in WO 96/17958. Techniques capable of producing high density arrays can also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) Biotechniques 23: 120-124; U.S. Pat. No. 5,143,854). One of skill will recognize that the precise sequence of the particular probes described herein can be modified to a certain degree to produce probes that are “substantially identical” to the disclosed probes, but retain the ability to specifically bind to (i.e., hybridize specifically to) the same targets or samples as the probe from which they were derived. Such modifications are specifically covered by reference to the individual probes described herein.

“Antibody” refers to a polypeptide comprising a framework region from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen, e.g., a specific bacterial antigen. Typically, the “variable region” contains the antigen-binding region of the antibody (or its functional equivalent) and is most critical in specificity and affinity of binding. See Paul, Fundamental Immunology (2003).

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” (about 25 kD) and one “heavy” chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (V_(L)) and variable heavy chain (V_(H)) refer to these light and heavy chains respectively.

Antibodies can exist as intact immunoglobulins or as any of a number of well-characterized fragments that include specific antigen-binding activity. Such fragments can be produced by digestion with various peptidases. Pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)′₂, a dimer of Fab which itself is a light chain joined to V_(H)-C_(H)1 by a disulfide bond. The F(ab)′₂ may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)′₂ dimer into an Fab′ monomer. The Fab′ monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature 348:552-554 (1990)).

II. MUCINS

There are several gel-forming mucins including, but not limited to, MUC6, MUC2, MUC5AC, and MUC5B. These proteins are large, filamentous, and highly O-glycosylated.

III. PULMONARY DISEASES

The pulmonary diseases contemplated herein can include any pulmonary disorders, lung fibrosis diseases, interstitial lung diseases, idiopathic interstitial pneumonias (IIP), idiopathic pulmonary fibrosis, familial interstitial pneumonia (FIP), acute respiratory distress syndrome (ARDS), scleroderma lung disease, sarcoidosis, beryllium disease, rheumatoid arthritis associated lung disorder, collagen vascular associated lung disorder, cigarette smoke associated lung disorders, Sjögren's syndrome, mixed connective tissue disease, nonspecific interstitial pneumonitis (NSIP), etc.

Pulmonary fibrotic conditions, e.g., interstitial lung diseases (ILD), are characterized by shortness of breath, chronic coughing, fatigue and weakness, loss of appetite, and rapid weight loss. Pulmonary fibrosis is commonly linked to interstitial lung diseases (e.g., autoimmune disorders, viral infections or other microscopic injuries), but can be idiopathic. Fibrosis involves replacement of normal lung tissue with fibrotic tissue (scar tissue), which leads to reduced oxygen capacity.

Idiopathic interstitial pneumonias (IIP) are a subset of diffuse interstitial lung diseases of unknown etiology (the term “idiopathic” indicates unknown origin). IIPs are characterized by expansion of the interstitial compartment (i.e., that portion of the lung parenchyma sandwiched between the epithelial and endothelial basement membranes) with an infiltrate of inflammatory cells. The inflammatory infiltrate is sometimes accompanied by fibrosis, either in the form of abnormal collagen deposition or proliferation of fibroblasts capable of collagen synthesis.

Idiopathic Pulmonary Fibrosis (IPF) occurs in thousands of people worldwide with a doubling of prevalence over the past 10 years. Onset of IPF occurs around 50 to 70 years of age and starts with progressive shortness of breath and hypoxemia. IPF median survival is around 3-5 years and is to date untreatable. The etiology and pathogenesis of the condition are not well understood. About 5-20 percent of all cases of IPF have a family history and inheritance appears to be autosomal dominant.

Additional fibrotic pulmonary diseases include Acute Interstitial Pneumonia (AIP), Respiratory Bronchiolitis-associated Interstitial Lung Disease (RBILD), Desquamative Interstitial Pneumonia (DIP), Non-Specific Interstitial Pneumonia (NSIP), and Bronchiolitis Obliterans with Organizing Pneumonia (BOOP).

AIP is a rapidly progressive and histologically distinct form of interstitial pneumonia. The pathological pattern is an organizing form of diffuse alveolar damage (DAD) that is also found in acute respiratory distress syndrome (ARDS) and other acute interstitial pneumonias of known causes (see Clinical Atlas of Interstitial Lung Disease (2006 ed.) pp 61-63).

RBILD is characterized by inflammatory lesions of the respiratory bronchioles in cigarette smokers. The histologic appearance of RBILD is characterized by the accumulation of pigmented macrophages within the respiratory bronchioles and the surrounding airspaces, variably accompanied by peribronchial fibrotic alveolar septal thickening and minimal associated mural inflammation (see Wells et al. (2003) Sem Respir. Crit. Care Med. vol. 24).

DIP is a rare interstitial lung disease characterized by the accumulation of macrophages in large numbers in the alveolar spaces associated with interstitial inflammation and/or fibrosis. The macrophages frequently contain light brown pigment. Lymphoid nodules are common, as is a sparse but distinct eosinophil infiltrate. DIP is most common in smokers (see Tazelaar et al. (Sep. 21, 2010) Histopathology).

NSIP is characterized pathologically by uniform interstitial inflammation and fibrosis appearing over a short period of time. NSIP differs from other interstitial lung diseases in that it has a generally good prognosis. In addition, the temporal uniformity of the parenchymal changes seen in NSIP contrasts greatly with the temporal heterogeneity of usual interstitial pneumonia (see Coche et al. (2001) Brit J Radiol 74:189).

BOOP, unlike NSIP, can be fatal within days of first acute symptoms. It is characterized by rapid onset of acute respiratory distress syndrome; therefore, clinically, rapidly progressive BOOP can be indistinguishable from acute interstitial pneumonia. Histological features include clusters of mononuclear inflammatory cells that form granulation tissue and plug the distal airways and alveolar spaces. These plugs of granulation tissue may form polyps that migrate within the alveolar ducts or may be focally attached to the wall (see White & Ruth-Saad (2007) Crit. Care Nurse 27:53).

Further details about the characteristics and therapies available for these diseases can be found, e.g., on the website of the American Lung Association at lungusa.org/lung-disease/pulmonary-fibrosis.

Diagnostic indicators of pulmonary disorders include biopsy (e.g., VATS or surgical lung biopsy), high resolution computed tomography (HRCT) or breathing metrics, such as forced expiratory volume in one second (FEV1), vital capacity (VC), forced vital capacity (FVC), and FEV1/FVC.
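For clarity, the FEV1/FVC metric named above is simply the ratio of the two spirometry volumes. The following Python sketch shows the calculation; the function name and the example volumes are hypothetical, and no diagnostic threshold is implied.

```python
def fev1_fvc_ratio(fev1_liters: float, fvc_liters: float) -> float:
    """Ratio of forced expiratory volume in one second to forced vital capacity."""
    if fvc_liters <= 0:
        raise ValueError("FVC must be positive")
    if fev1_liters < 0:
        raise ValueError("FEV1 must be non-negative")
    return fev1_liters / fvc_liters


# Hypothetical spirometry values in liters, for illustration only.
print(fev1_fvc_ratio(3.0, 4.0))  # 0.75
```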

Additional disorders associated with MUC5B expression and/or SNPs associated with MUC5B (e.g., SNP rs35705950) can include, but are not limited to, mucous secretion disorders, cancers (e.g., ovarian, breast, lung, pancreatic, etc.), eye disease, colitis, and cirrhosis of the liver.

IV. METHODS OF DIAGNOSIS AND PROGNOSIS

Methods for detecting and identifying nucleic acids and proteins and interactions between such molecules involve conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature (see, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition 1989, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Animal Cell Culture, R. I. Freshney, ed., 1986).

A. Biological Samples

For detection of a genetic variant using genomic DNA, a biological sample can be obtained from nearly any tissue. One of skill in the art will understand that a blood sample or a cheek swab is expected to carry the same genetic sequence information as a lung cell. For detection of a given expression level, pulmonary tissue samples and other biological fluids are typically used.

Biological samples can include a pulmonary mucosal sample or biological fluid such as blood or blood components (plasma, serum), sputum, mucus, urine, saliva, etc.

A pulmonary mucosal sample can be obtained using methods known in the art, e.g., a bronchial epithelial brush or exhaled breath condensate. Additional methods include bronchial biopsy, bronchial wash, bronchoalveolar lavage, whole lung lavage, transendoscopic biopsy, translaryngoscopic catheter, and transtracheal wash. A review of commonly used techniques, including comparisons and safety issues, is provided in Busse et al. (2005) Am J Respir Crit Care Med 172:807-816.

For lavage techniques, a bronchoscope can be inserted to the desired level of the airway. A small volume of sterile, physiologically acceptable fluid (e.g., buffered saline) is released, and immediately aspirated. The wash material contains cells from the mucosa and upper epithelia (Riise et al. (1996) Eur Resp J 9:1665).

For use of a bronchial epithelial brush, a sterile, non-irritating (e.g., nylon) cytology brush can be used. Multiple brushings can be taken to ensure representative sampling. The brush is then agitated in physiologically acceptable fluid, and the cells and debris separated using routine methods (Riise et al. (1992) Eur Resp J 5:382).

Cellular components can be isolated using methods known in the art, e.g., centrifugation. Similarly, subcellular components (e.g., exosomes or vesicles) can be isolated using known methods or commercial separation products (available from BioCat, System Bio, Bioscientific, etc.). An exemplary method is described, e.g., by Thery et al. (2006) Current Prot. Cell Biol.

B. Detection of Genetic Variants

The inventors have found that genetic variations in the mucin genes are associated with pulmonary diseases. These genetic variations can be found in any part of the gene, e.g., in the regulatory regions, introns, or exons. Relevant genetic variations may also be found in the intergenic regions, e.g., in sequences between mucin genes. Insertions, substitutions, and deletions are included in genetic variants. Single nucleotide polymorphisms (SNPs) are exemplary genetic variants.

In particular, 14 independent SNPs are associated with pulmonary disorders (e.g., FIP or IPF). The studies disclosed herein demonstrate that presence of one or more of these SNPs associated with MUC5B can lead to predisposition to a pulmonary disorder. In addition, in some embodiments, if present, some of these SNPs are related to a transcription factor binding site. The transcription factor binding site can effect modulation of MUC5B expression, for example E2F3 loss, and HOXA9 and PAX-2 generation.

The invention thus provides methods for assessing the presence or absence of SNPs in a sample from a subject suspected of having or developing a pulmonary disorder (e.g., because of family history). In certain embodiments, one or more SNPs are screened in one or more samples from a subject. The SNPs can be associated with one or more genes, e.g., one or more MUC genes or other genes associated with mucous secretion. In some embodiments, a MUC gene associated SNP is associated with MUC5B and/or another MUC gene, such as MUC5AC or MUC1. SNPs contemplated for diagnosis, treatment, or prognosis can include SNPs found within a MUC gene and/or within a regulatory or promoter region associated with a MUC gene. For example, one or more SNPs can include, but are not limited to, the SNPs of MUC5B shown in Table 4 (SEQ ID NOs:20-53), e.g., SNP rs35705950 (SEQ ID NO:24), detected alone or in combination with other genetic variations or SNPs and/or other diagnostic or prognostic methods.

Methods for detecting genetic variants such as a SNP are known in the art, e.g., Southern or Northern blot, nucleotide array, amplification methods, etc. Primers or probes are designed to hybridize to a target sequence. For example, genomic DNA can be screened for the presence of an identified genetic element using a probe based upon one or more sequences, e.g., using a probe with substantial identity to a subsequence of the MUC5B gene, such as one of the subsequences shown in Table 4 (SEQ ID NOs: 20-53). Exemplary human MUC5B genomic sequences that can be used for reference and probe and primer design are found at GenBank Accession Nos. AF107890.1 and AJ004862.1. Expressed RNA can also be screened, but may not include all relevant genetic variations. Various degrees of stringency of hybridization may be employed in the assay. As the conditions for hybridization become more stringent, there must be a greater degree of complementarity between the probe and the target for duplex formation to occur. Thus, high stringency conditions are typically used for detecting a SNP.

Thus, in some embodiments, a genetic variant MUC5B gene in a subject is detected by contacting a nucleic acid in a sample from the subject with a probe having substantial identity to a subsequence of the MUC5B gene, and determining whether the nucleic acid indicates that the subject has a genetic variant MUC5B gene. In some cases, the sample can be processed prior to amplification, e.g., to separate genomic DNA from other sample components. In some cases, the probe has at least 90, 92, 94, 95, 96, 98, 99, or 100% identity to the MUC5B gene subsequence. Typically, the probe is 10-500 nucleotides in length, e.g., 10-100, 10-40, 10-20, 20-100, 100-400, etc. In the case of detecting a SNP, the probe can be even shorter, e.g., 8-20 nucleotides in length. In some cases, the MUC5B gene sequence to be detected includes at least 8 contiguous nucleotides, e.g., at least 10, 15, 20, 25, 30, 35 or more contiguous nucleotides of one of the sequences shown in SEQ ID NOs:20-53. In some embodiments, the sequence to be detected includes at least 8 contiguous nucleotides, e.g., at least 10, 15, 20, 25, 30, 35 or more contiguous nucleotides of SEQ ID NO:24. In some aspects, the contiguous nucleotides include nucleotide 28 of SEQ ID NO:24.

The degree of stringency can be controlled by temperature, ionic strength, pH and/or the presence of a partially denaturing solvent such as formamide. For example, the stringency of hybridization is conveniently varied by changing the concentration of formamide within the range of up to about 50%. The degree of complementarity (sequence identity) required for detectable binding will vary in accordance with the stringency of the hybridization medium and/or wash medium. In certain embodiments, in particular for detection of a particular SNP, the degree of complementarity is about 100 percent. In other embodiments, sequence variations can result in probes with <100% complementarity, <90% complementarity, <80% complementarity, etc., in particular, in a sequence that does not involve a SNP. In some examples, e.g., detection of species homologs, reduced complementarity of the primers may be compensated for by reducing the stringency of the hybridization and/or wash medium.
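The degree of complementarity between a probe and its target site can be illustrated computationally. The Python sketch below compares a DNA probe against the reverse complement of an equal-length target site and reports the percentage of positions that would base-pair. The function names and the 8-nucleotide sequences are hypothetical and are not taken from SEQ ID NOs:20-53.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_complementarity(probe: str, target_site: str) -> float:
    """Percent of probe positions that base-pair with an equal-length target site.

    Both sequences are written 5' to 3'; the probe anneals antiparallel
    to the target, so it is compared with the target's reverse complement.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    if not probe:
        raise ValueError("sequences must be non-empty")
    perfect_probe = reverse_complement(target_site)
    paired = sum(p == e for p, e in zip(probe.upper(), perfect_probe))
    return 100.0 * paired / len(probe)


target = "GATTACAG"  # hypothetical target site
print(percent_complementarity("CTGTAATC", target))  # 100.0 (perfect match)
print(percent_complementarity("CTGTTATC", target))  # 87.5  (one mismatch)
```

For SNP detection, where about 100 percent complementarity is desired, a single mismatch at the polymorphic position would be expected to distinguish the two alleles under sufficiently stringent conditions.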

High stringency conditions for nucleic acid hybridization are well known in the art. For example, conditions may comprise low salt and/or high temperature conditions, such as provided by about 0.02 M to about 0.15 M NaCl at temperatures of about 50° C. to about 70° C. Other exemplary conditions are disclosed in the following Examples. It is understood that the temperature and ionic strength of a desired stringency are determined in part by the length of the particular nucleic acid(s), the length and nucleotide content of the target sequence(s), the charge composition of the nucleic acid(s), and by the presence or concentration of formamide, tetramethylammonium chloride or other solvent(s) in a hybridization mixture. Nucleic acids can be completely complementary to a target sequence or exhibit one or more mismatches.

Nucleic acids of interest (e.g., nucleic acids comprising, or comprised within, SEQ ID NOs:20-53) can also be amplified using a variety of known amplification techniques. For instance, polymerase chain reaction (PCR) technology may be used to amplify target sequences (e.g., genetic variants) directly from DNA, RNA, or cDNA. In some embodiments, a stretch of nucleic acids is amplified using primers on either side of a targeted genetic variation, and the amplification product is then sequenced to detect the targeted genetic variation (using, e.g., Sanger sequencing, Pyrosequencing, Nextgen® sequencing technologies). For example, the primers can be designed to hybridize to either side of the upstream regulatory region of the MUC5B gene, and the intervening sequence determined to detect a SNP in the promoter region. In some embodiments, one of the primers can be designed to hybridize to the targeted genetic variant. In some cases, a genetic variant nucleotide can be identified using RT-PCR, e.g., using labeled nucleotide monomers. In this way, the identity of the nucleotide at a given position can be detected as it is added to the polymerizing nucleic acid. The Scorpion™ system is a commercially available example of this technology.

Thus, in some embodiments, a genetic variant MUC5B gene in a subject is detected by amplifying a nucleic acid in a sample from the subject to form an amplification product, and determining whether the amplification product indicates a genetic variant MUC5B gene. In some cases, the sample can be processed prior to amplification, e.g., to separate genomic DNA from other sample components. In some cases, amplifying comprises contacting the sample with amplification primers having substantial identity to MUC5B genomic subsequences, e.g., at least 90, 92, 94, 95, 96, 98, 99, or 100% identity. Typically, the sequence to be amplified is 30-1000 nucleotides in length, e.g., 50-500, 50-400, 100-400, 50-200, 100-300, etc. In some cases, the sequence to be amplified or detected includes at least 8 contiguous nucleotides, e.g., at least 10, 15, 20, 25, 30, 35 or more contiguous nucleotides of one of the sequences shown in SEQ ID NOs:20-53. In some embodiments, the sequence to be amplified or detected includes at least 8 contiguous nucleotides, e.g., at least 10, 15, 20, 25, 30, 35 or more contiguous nucleotides of SEQ ID NO:24. In some aspects, the contiguous nucleotides include nucleotide 28 of SEQ ID NO:24.
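The use of primers on either side of a targeted variation can be sketched in silico. The following Python example locates a forward primer and the binding site of a reverse primer on a single template strand and returns the sequence that would be amplified. It is a simplified, exact-match illustration only: the template and primers are hypothetical sequences invented for this sketch (they are not MUC5B sequences), and mismatches, primer specificity, and melting temperature are ignored.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the template region spanned by the primer pair, or None.

    The forward primer must match the template strand exactly; the reverse
    primer must match the opposite strand, i.e. its reverse complement must
    occur in the template downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template: flank + forward site + intervening region + reverse site + flank.
template = "TT" + "GACCTAGG" + "CATCGATCGGTAC" + "CAAGTCCG" + "ATT"
product = in_silico_amplicon(template, "GACCTAGG", "CGGACTTG")
print(product)       # GACCTAGGCATCGATCGGTACCAAGTCCG
print(len(product))  # 29
```

The returned product includes both primer sites and the intervening sequence, which is the region that would then be sequenced to read the nucleotide at the targeted position.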

Amplification techniques can also be useful for cloning nucleic acid sequences, to make nucleic acids to use as probes for detecting the presence of a target nucleic acid in samples, for nucleic acid sequencing, for control samples, or for other purposes. Probes and primers are also readily available from commercial sources, e.g., from Invitrogen, Clontech, etc.

C. Detection of Expression Levels

Expression of a given gene, e.g., MUC5B or another mucin, pulmonary disease marker, or standard (control), is typically detected by detecting the amount of RNA (e.g., mRNA) or protein. Sample levels can be compared to a control level.

Methods for detecting RNA are largely cumulative with the nucleic acid detection assays described above. RNA to be detected can include mRNA. In some embodiments, a reverse transcriptase reaction is carried out and the targeted sequence is then amplified using standard PCR. Quantitative PCR (qPCR) or real time PCR (RT-PCR) is useful for determining relative expression levels, when compared to a control. Quantitative PCR techniques and platforms are known in the art, and commercially available (see, e.g., the qPCR Symposium website, available at qpcrsymposium.com). Nucleic acid arrays are also useful for detecting nucleic acid expression. Customizable arrays are available from, e.g., Affymetrix. An exemplary human MUC5B mRNA sequence, e.g., for probe and primer design, can be found at GenBank Accession No. AF086604.1.
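One commonly used way to express relative expression from qPCR data is the comparative Ct (2^-ΔΔCt) calculation, sketched below in Python. This particular calculation is not specified in the present disclosure and is shown only as an example; it assumes roughly 100% amplification efficiency for both the target and the reference gene. The function name and the Ct values are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold expression of a target gene in a sample relative to a control.

    Uses the comparative Ct (2^-ddCt) calculation, which assumes that both
    amplicons double at every cycle. Each target Ct is first normalized to
    a reference gene measured in the same specimen.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** -delta_delta_ct


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in
# the sample than in the control, with identical reference-gene Ct values.
print(relative_expression(22.0, 18.0, 25.0, 18.0))  # 8.0
```

A result greater than 1 indicates higher expression in the sample than in the control, and a result below 1 indicates lower expression.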

Protein levels can be detected using antibodies or antibody fragments specific for that protein, natural ligands, small molecules, aptamers, etc. An exemplary human MUC5B sequence, e.g., for screening a targeting agent, can be found at UniProt Accession No. O00446.

Antibody based techniques are known in the art, and described, e.g., in Harlow & Lane (1988) Antibodies: A Laboratory Manual and Harlow (1998) Using Antibodies: A Laboratory Manual; Wild, The Immunoassay Handbook, 3d edition (2005) and Law, Immunoassay: A Practical Guide (1996). The assay can be directed to detection of a molecular target (e.g., protein or antigen), or a cell, tissue, biological sample, liquid sample or surface suspected of carrying an antibody or antibody target.

A non-exhaustive list of immunoassays includes: competitive and non-competitive formats, enzyme-linked immunosorbent assays (ELISA), microspot assays, Western blots, gel filtration and chromatography, immunochromatography, immunohistochemistry, flow cytometry or fluorescence activated cell sorting (FACS), microarrays, and more. Such techniques can also be used in situ, ex vivo, or in vivo, e.g., for diagnostic imaging.

Aptamers are nucleic acids that are designed to bind to a wide variety of targets in a non-Watson Crick manner. An aptamer can thus be used to detect or otherwise target nearly any molecule of interest, including a pulmonary disease associated protein. Methods of constructing and determining the binding characteristics of aptamers are well known in the art. For example, such techniques are described in U.S. Pat. Nos. 5,582,981, 5,595,877 and 5,637,459. Aptamers are typically at least 5 nucleotides, 10, 20, 30 or 40 nucleotides in length, and can be composed of modified nucleic acids to improve stability. Flanking sequences can be added for structural stability, e.g., to form 3-dimensional structures in the aptamer.

Protein detection agents described herein can also be used for treatment and/or diagnosis of pulmonary disease, or as a predictor of disease progression, e.g., propensity for survival, in a subject having or suspected of developing a pulmonary disorder. In certain embodiments, MUC5B antibodies can be used to assess MUC5B protein levels in a subject having or suspected of developing a pulmonary disorder. It is contemplated herein that antibodies or antibody fragments may be used to modulate MUC5B production in a subject having or suspected of developing a pulmonary disease. In certain embodiments, one or more agents capable of modulating MUC5B may be used to treat a subject having or suspected of developing a pulmonary disorder. One or more antibodies or antibody fragments may be generated to detect one or more of the SNPs disclosed herein by any method known in the art.

In certain embodiments, MUC5B diagnostic tests may include, but are not limited to, alone or in combination, analysis of the rs35705950 SNP in the MUC5B gene, MUC5B mRNA levels, and/or MUC5B protein levels.

D. Additional Pulmonary Disease Markers

The above methods of detection can be applied to additional pulmonary disease markers. That is, the expression level or presence of genetic variants of at least one additional pulmonary disease marker gene can be determined, or the activity of the marker protein can be determined, and compared to a standard control for the pulmonary disease marker. The examination of additional pulmonary disease markers can be used to confirm a diagnosis of pulmonary disease, monitor disease progression, or determine the efficacy of a course of treatment in a subject.

In some cases, pulmonary disease is indicated by an increased number of lymphocytes, e.g., CD4+CD28− cells (Moeller et al. (2009) Am. J. Respir. Crit. Care Med. 179:588; Gilani (2010) PLoS One 5:e8959).

Genetic variations in the following genes are associated with pulmonary disease: Surfactant Protein A2, Surfactant Protein B, Surfactant Protein C, TERC, TERT, IL-1RN, IL-1α, IL-1β, TNF, Lymphotoxin α, TNF-RII, IL-10, IL-6, IL-12, IFNγ, TGFβ, CR1, ACE, IL-8, CXCR1, CXCR2, MUC1 (KL6), and MUC5AC. Thus, the invention further includes methods of determining whether the genome of a subject comprises a genetic variant of at least one gene selected from these genes. The presence of a genetic variant indicates that the subject has or is at risk of developing pulmonary disease. Said determining can optionally be combined with determining whether the genome of the subject comprises a genetic variant MUC5B gene, or determining whether the subject has an elevated level of MUC5B RNA or protein, to confirm or strengthen the diagnosis or prognosis.

Abnormal expression of the following markers can also be indicative of pulmonary disease: Surfactant Protein A, Surfactant Protein D, KL-6/MUC1, CC16, CK-19, Ca 19-9, SLX, MCP-1, MIP-1a, ITAC, glutathione, type III procollagen peptide, sIL-2R, ACE, neopterin, beta-glucuronidase, LDH, CCL-18, CCL-2, CXCL12, MMP7, and osteopontin. Thus, the expression of one of these markers can be detected and compared to a control, wherein an abnormal expression level indicates that the subject has or is at risk of developing pulmonary disease. Said determining can optionally be combined with determining whether the genome of the subject comprises a genetic variant MUC5B gene, or determining whether the subject has an elevated level of MUC5B RNA or protein, to confirm or strengthen the diagnosis or prognosis.

E. Indications

The detection methods described herein can be used for diagnosis, prognosis, risk prediction, determining a course of treatment, monitoring therapeutic efficacy, and monitoring disease progression. One of skill will appreciate that each of the detection methods can be used alone or in combination.

For example, the presence of a genetic variant MUC5B gene can be determined in a subject suspected of having or at risk of developing a pulmonary disorder. In the event that a genetic variant MUC5B gene is observed, the subject can optionally undergo further testing, e.g., to determine the level of MUC5B gene expression, or detect a genetic variant form of at least one additional pulmonary disease marker. The subject can be prescribed a course of treatment based on the results of one or more tests. Such treatment can include administration of a MUC5B antagonist, or a standard pulmonary disease treatment such as a mucolytic drug. The expression level of the MUC5B gene can be detected again after treatment, or periodically during the course of treatment, to determine the therapeutic efficacy of the treatment. For example, if a pulmonary disease treatment is prescribed for periodic administration (e.g., daily, twice-daily, weekly, etc.), the MUC5B gene expression level can be monitored periodically thereafter (e.g., monthly).

The detection methods of the invention can be used to determine if the subject has an attenuated form of the pulmonary disease. The inventors have shown that individuals carrying the rs35705950 genetic variant MUC5B gene have a better pulmonary disease prognosis than individuals that do not carry a genetic variant MUC5B gene. Thus, determination of whether an individual carries the genetic variant MUC5B gene can be used to design a course of treatment for the individual.

V. METHODS OF TREATMENT

A. Pulmonary Disease Treatments

A number of pulmonary disease treatments are available for addressing airway inflammation and/or excess mucus secretion. These include agents that can be roughly categorized, e.g., as mucolytic agents, mucoregulatory agents, mucokinetic agents, and expectorants (see, e.g., Balsamo et al. (2010) Eur. Respir. Rev. 19:127-33), though there is some overlap in the categories. Such agents are useful for treating the pulmonary diseases described herein, e.g., as part of a course of treatment and monitoring, or after detection of elevated MUC5B RNA or protein, or detection of a genetic variant MUC5B gene.

Mucolytic drugs are those that decrease mucus viscosity, either by depolymerizing mucin glycoproteins or depolymerizing DNA and F-actin polymer networks. The first mode of action can be particularly useful for addressing excess MUC5B. Exemplary mucolytics include N-acetylcysteine, N-acystelyn, erdosteine, dornase alfa, thymosin beta4, dextran, pulmozyme, heparin, and bronchitol (inhaled mannitol).

Mucoregulators are those agents that regulate mucus secretion, or interfere with the DNA/F-actin network. Examples of mucoregulators include, e.g., carbocysteine, anticholinergic agents, glucocorticoids, and macrolide antibiotics.

Mucokinetic agents increase mucus clearance by acting on the cilia lining the airway. Exemplary mucokinetic agents include, e.g., bronchodilators, surfactants, and ambroxol.

Expectorants are agents that induce discharge of mucus from the airway or respiratory tract. Some examples include hypertonic saline, guaifenesin, dornase/pulmozyme, and bronchitol (inhaled mannitol).

The pulmonary disease treatment, such as the agents described above, can be used alone, sequentially, or in combination according to the methods described herein. In some embodiments, a pulmonary disease treatment is used in combination with a more targeted inhibitor of MUC5B expression.

B. MUC5B Antagonists

The results disclosed herein indicate that elevated expression of the MUC5B gene is associated with pulmonary disease. The invention thus includes methods and compositions for inhibiting the expression, secretion, and/or activity of MUC5B. Exemplary inhibitors include siRNA and antisense, pRNA (promoter-associated RNA, see, e.g., Schmitz et al. (2010) Genes Dev. 24:2264-69), MUC5B-specific antibodies and fragments thereof, and MUC5B-specific aptamers. In some embodiments, MUC5B activity can be inhibited or MUC5B clearance can be increased, e.g., using mucolytic agents, glycosylation inhibitors, or inhibitors of protein secretion. The terms “inhibitor” and “antagonist” and like terms are used synonymously herein.

Thus, a nucleotide sequence that specifically interferes with expression of the MUC5B gene at the transcriptional or translational level can be used to treat or prevent pulmonary disease. This approach may utilize, for example, siRNA and/or antisense oligonucleotides to block transcription or translation of a specific mRNA (e.g., a genetic variant RNA), either by inducing degradation of the mRNA with a siRNA or by masking the mRNA with an antisense nucleic acid. In some embodiments, the siRNA or antisense construct does not significantly block expression of other mucin genes.

Double stranded siRNA that corresponds to the MUC5B gene can be used to silence the transcription and/or translation by inducing degradation of MUC5B mRNA transcripts, and thus treat or prevent pulmonary disease (e.g., pulmonary disease associated with genetic variant MUC5B). The siRNA is typically about 5 to about 100 nucleotides in length, more typically about 10 to about 50 nucleotides in length, most typically about 15 to about 30 nucleotides in length. siRNA molecules and methods of generating them are described in, e.g., Bass, 2001, Nature, 411, 428-429; Elbashir et al., 2001, Nature, 411, 494-498; WO 00/44895; WO 01/36646; WO 99/32619; WO 00/01846; WO 01/29058; WO 99/07409; and WO 00/44914. A DNA molecule that transcribes dsRNA or siRNA (for instance, as a hairpin duplex) also provides RNAi. DNA molecules for transcribing dsRNA are disclosed in U.S. Pat. No. 6,573,099, and in U.S. Patent Application Publication Nos. 2002/0160393 and 2003/0027783, and Tuschl and Borkhardt, Molecular Interventions, 2:158 (2002). For example, dsRNA oligonucleotides that specifically hybridize to the MUC5B nucleic acid sequences described herein can be used in the methods of the present invention. A decrease in the severity of pulmonary disease symptoms in comparison to symptoms detected in the absence of the interfering RNA can be used to monitor the efficacy of the siRNA.

Antisense oligonucleotides that specifically hybridize to nucleic acid sequences encoding MUC5B polypeptides can also be used to silence transcription and/or translation, and thus treat or prevent pulmonary disease. For example, antisense oligonucleotides that specifically hybridize to a MUC5B polynucleotide sequence can be used. A decrease in the severity of pulmonary disease symptoms in comparison to symptoms detected in the absence of the antisense nucleic acids can be used to monitor the efficacy of the antisense nucleic acids.

Antisense nucleic acids are DNA or RNA molecules that are complementary to at least a portion of a specific mRNA molecule (see, e.g., Weintraub, Scientific American, 262:40 (1990)). Synthetic antisense oligonucleotides are generally between 15 and 25 bases in length. Antisense nucleic acids may comprise naturally occurring nucleotides or modified nucleotides such as, e.g., phosphorothioate, methylphosphonate, and α-anomeric sugar-phosphate backbone-modified nucleotides.
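To make the complementarity requirement concrete, the following minimal sketch derives a DNA antisense oligonucleotide as the reverse complement of an mRNA segment and checks it against the 15 to 25 base range mentioned above. The function names are hypothetical, and the example segment is an arbitrary placeholder rather than MUC5B sequence; selecting an actual target site would additionally require specificity and accessibility analysis that is not shown here.

```python
# Map each mRNA base to the DNA base that pairs with it.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna_segment: str) -> str:
    """Return the DNA antisense oligo (written 5'->3') for an mRNA segment
    (also written 5'->3'), i.e., its reverse complement."""
    segment = mrna_segment.upper()
    unexpected = set(segment) - set("ACGU")
    if unexpected:
        raise ValueError(f"Unexpected characters in mRNA: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return segment.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


def is_typical_length(oligo: str, minimum: int = 15, maximum: int = 25) -> bool:
    """Check the oligo against the 15-25 base range cited in the text."""
    return minimum <= len(oligo) <= maximum


# Arbitrary 15-nt placeholder segment, not taken from MUC5B.
target = "AUGGCCAUUGUAAUG"
oligo = antisense_dna(target)
print(oligo, is_typical_length(oligo))
```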

In the cell, the antisense nucleic acids hybridize to the corresponding mRNA, forming a double-stranded molecule. The antisense nucleic acids interfere with the translation of the mRNA, since the cell will not translate a mRNA that is double-stranded. Antisense oligomers of about 15 nucleotides are preferred, since they are easily synthesized and are less likely to cause problems than larger molecules when introduced into the target nucleotide mutant producing cell. The use of antisense methods to inhibit the in vitro translation of genes is well known in the art (Marcus-Sekura, Anal. Biochem., 172:289 (1988)). Less commonly, antisense molecules which bind directly to the DNA may be used.

siRNA and antisense can be delivered to the subject using any means known in the art, including by injection, inhalation, or oral ingestion. Another suitable delivery system is a colloidal dispersion system such as, for example, macromolecule complexes, nanocapsules, microspheres, beads, and lipid-based systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes. The preferred colloidal system of this invention is a liposome. Liposomes are artificial membrane vesicles which are useful as delivery vehicles in vitro and in vivo. Nucleic acids, including RNA and DNA, can be encapsulated within liposomes and be delivered to cells in a biologically active form (Fraley, et al., Trends Biochem. Sci., 6:77, 1981). Liposomes can be targeted to specific cell types or tissues using any means known in the art.

The invention also provides antibodies that specifically bind to MUC5B protein. Such antibodies can be used to sequester secreted MUC5B, e.g., to prevent gel-forming activity and formation of excess mucus.

An antibody that specifically detects MUC5B, and not other mucin proteins, can be isolated using standard techniques described herein. The protein sequences for MUC5B in a number of species, e.g., humans, non-human primates, rats, dogs, cats, horses, bovines, etc., are publicly available.

Monoclonal antibodies are obtained by various techniques familiar to those skilled in the art. Briefly, spleen cells from an animal immunized with a desired antigen are immortalized, commonly by fusion with a myeloma cell (see, for example, Kohler & Milstein, Eur. J. Immunol. 6:511-519 (1976)). Alternative methods of immortalization include transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other methods well known in the art. Colonies arising from single immortalized cells are screened for production of antibodies of the desired specificity and affinity for the antigen, and yield of the monoclonal antibodies produced by such cells may be enhanced by various techniques, including injection into the peritoneal cavity of a vertebrate host. Alternatively, one may isolate DNA sequences which encode a monoclonal antibody or a binding fragment thereof by screening a DNA library from human B cells according to the general protocol outlined by Huse et al., Science 246:1275-1281 (1989).

Monoclonal antibodies are collected and titered against the MUC5B in an immunoassay, for example, a solid phase immunoassay with the immunogen immobilized on a solid support. Monoclonal antibodies will usually bind with a K_(d) of at least about 0.1 mM, more usually at least about 1 μM, and can often be designed to bind with a K_(d) of 1 nM or less.
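To give a sense of what the dissociation constants above imply, the following sketch computes the equilibrium fraction of antigen bound for a simple one-to-one binding model, in which fraction bound = [Ab] / ([Ab] + K_(d)). The model, the function name, and the 10 nM free antibody concentration are illustrative assumptions; real antibody-mucin binding may involve avidity and multivalency effects that this model ignores.

```python
def fraction_bound(free_antibody_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of antigen bound in a 1:1 binding model,
    assuming free antibody is in excess over antigen."""
    if free_antibody_molar < 0 or kd_molar <= 0:
        raise ValueError("Concentration must be >= 0 and Kd must be > 0")
    return free_antibody_molar / (free_antibody_molar + kd_molar)


# Compare the three K_d values named in the text at a hypothetical
# free antibody concentration of 10 nM.
antibody = 10e-9
for label, kd in [("0.1 mM", 1e-4), ("1 uM", 1e-6), ("1 nM", 1e-9)]:
    print(f"Kd = {label}: {fraction_bound(antibody, kd):.4f} of antigen bound")
```

Under these assumptions, a lower K_(d) means that a larger share of antigen is bound at the same antibody concentration, which is why the nanomolar binders are the most useful for sequestering or detecting MUC5B.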

The immunoglobulins, including MUC5B-binding fragments and derivatives thereof, can be produced readily by a variety of recombinant DNA techniques, including by expression in transfected cells (e.g., immortalized eukaryotic cells, such as myeloma or hybridoma cells) or in mice, rats, rabbits, or other vertebrates capable of producing antibodies by well known methods. Suitable source cells for the DNA sequences and host cells for immunoglobulin expression and secretion can be obtained from a number of sources, such as the American Type Culture Collection (Catalogue of Cell Lines and Hybridomas, Fifth edition (1985) Rockville, Md.).

In some embodiments, the antibody is a humanized antibody, i.e., an antibody that retains the reactivity of a non-human antibody while being less immunogenic in humans. This can be achieved, for instance, by retaining the non-human CDR regions that are specific for MUC5B, and replacing the remaining parts of the antibody with their human counterparts. See, e.g., Morrison et al., PNAS USA, 81:6851-6855 (1984); Morrison and Oi, Adv. Immunol., 44:65-92 (1988); Verhoeyen et al., Science, 239:1534-1536 (1988); Padlan, Molec. Immun., 28:489-498 (1991); Padlan, Molec. Immun., 31(3):169-217 (1994). Techniques for humanizing antibodies are well known in the art and are described in, e.g., U.S. Pat. Nos. 4,816,567; 5,530,101; 5,859,205; 5,585,089; 5,693,761; 5,693,762; 5,777,085; 6,180,370; 6,210,671; and 6,329,511; WO 87/02671; EP Patent Application 0173494; Jones et al. (1986) Nature 321:522; and Verhoeyen et al. (1988) Science 239:1534. Humanized antibodies are further described in, e.g., Winter and Milstein (1991) Nature 349:293. For example, polynucleotides comprising a first sequence coding for humanized immunoglobulin framework regions and a second sequence set coding for the desired immunoglobulin complementarity determining regions can be produced synthetically or by combining appropriate cDNA and genomic DNA segments. Human constant region DNA sequences can be isolated in accordance with well known procedures from a variety of human cells.

The activity of MUC5B protein can be inhibited, or the clearance of MUC5B can be increased, using mucolytic agents that break up mucus and proteolyze mucins. Mucolytic agents are described herein. Additional inhibitors of MUC5B protein include glycosylation inhibitors and inhibitors of protein secretion from epithelial cells. An exemplary glycosylation inhibitor is benzyl-O-N-acetyl-D-galactosamine (specific for O-glycans). Additional inhibitors of protein glycosylation are disclosed, e.g., in Jacob (1995) Curr. Opin. Structural Biol. 5:605-11 and Patsos et al. (2005) Biochem. Soc. Trans. 33:721-23. Secretion inhibitors include Brefeldin A, colchicine, and small molecules such as that disclosed in Stockwell (2006) Nat. Chem. Biol. 2:7-8. MUC5B activity can also be modulated by targeting the MARCKS protein (Adler et al. (2000) Chest 117: Supp 1 266S-267S).

C. Methods of Identifying MUC5B Antagonists

The invention further provides methods for identifying additional antagonists of MUC5B expression, secretion, and/or activity. Methods for screening for antagonists can involve measuring the ability of the potential antagonists to reduce an identifiable MUC5B activity or compete for binding with a known binding agent (e.g., MUC5B-specific antibody). For example, candidate agents can be screened for their ability to reduce MUC5B gel formation, reduce MUC5B secretion, reduce MUC5B glycosylation, etc.

The screening methods of the invention can be performed as in vitro or cell-based assays. Cell based assays can be performed in any cells in which MUC5B is expressed, either endogenously or through recombinant methods. Cell-based assays may involve whole cells or cell fractions containing MUC5B to screen for agent binding or modulation of MUC5B activity by the agent. Suitable cell-based assays are described in, e.g., DePaola et al., Annals of Biomedical Engineering 29:1-9 (2001).

Agents that are initially identified as inhibiting MUC5B can be further tested to validate the apparent activity. Preferably such studies are conducted with suitable cell-based or animal models of pulmonary disease. The basic format of such methods involves administering a lead compound identified during an initial screen to an animal that serves as a model and then determining if in fact the pulmonary disease is ameliorated. The animal models utilized in validation studies generally are mammals of any kind. Specific examples of suitable animals include, but are not limited to, primates (e.g., chimpanzees, monkeys, and the like) and rodents (e.g., mice, rats, guinea pigs, rabbits, and the like).

The agents tested as potential antagonists of MUC5B can be any small chemical compound, or a biological entity, such as a polypeptide, sugar, nucleic acid or lipid. Alternatively, modulators can be genetically altered versions of MUC5B, e.g., forms that are not glycosylated. Essentially any chemical compound can be used as a potential modulator or ligand in the assays of the invention, although most often compounds that can be dissolved in aqueous or organic (especially DMSO-based) solutions are used. The assays are designed to screen large chemical libraries by automating the assay steps and providing compounds from any convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on microtiter plates in robotic assays).

In one embodiment, high throughput screening methods involve providing a combinatorial chemical or peptide library containing a large number of potential therapeutic compounds (potential modulator or ligand compounds). Such “combinatorial chemical libraries” or “ligand libraries” are then screened in one or more assays, as described herein, to identify those library members (particular chemical species or subclasses) that display a desired characteristic activity. The compounds thus identified can serve as conventional “lead compounds” or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis, by combining a number of chemical “building blocks” such as reagents. For example, a linear combinatorial chemical library such as a polypeptide library is formed by combining a set of chemical building blocks (amino acids) in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks.
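The arithmetic behind a linear combinatorial library can be sketched as follows: with k building blocks and a compound length of n, there are k^n possible sequences. The example below is illustrative only; the function names are hypothetical, and the choice of the 20 standard amino acids and a length of five is an assumption used to show how quickly the library reaches millions of members.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(num_building_blocks: int, length: int) -> int:
    """Number of distinct linear compounds of the given length."""
    return num_building_blocks ** length


def enumerate_library(building_blocks: str, length: int):
    """Lazily yield every linear combination of the building blocks."""
    for combination in product(building_blocks, repeat=length):
        yield "".join(combination)


# A pentapeptide library already exceeds three million members.
print(library_size(len(AMINO_ACIDS), 5))

# Enumerating a deliberately tiny library (2 blocks, length 3) in full.
print(list(enumerate_library("AG", 3)))
```

The generator form avoids holding a full library in memory, which matters once the length grows beyond a few positions.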

Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide libraries (see, e.g., U.S. Pat. No. 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-493 (1991) and Houghten et al., Nature 354:84-88 (1991)). Other chemistries for generating chemical diversity libraries can also be used. Such chemistries include, but are not limited to: peptoids (e.g., PCT Publication No. WO 91/19735), encoded peptides (e.g., PCT Publication WO 93/20242), random bio-oligomers (e.g., PCT Publication No. WO 92/00091), benzodiazepines (e.g., U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA 90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc. 114:6568 (1992)), nonpeptidal peptidomimetics with glucose scaffolding (Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)), oligocarbamates (Cho et al., Science 261:1303 (1993)), and/or peptidyl phosphonates (Campbell et al., J. Org. Chem. 59:658 (1994)), nucleic acid libraries (see Ausubel, Berger and Sambrook, all supra), peptide nucleic acid libraries (see, e.g., U.S. Pat. No. 5,539,083), antibody libraries (see, e.g., Vaughan et al., Nature Biotechnology, 14(3):309-314 (1996) and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al., Science, 274:1520-1522 (1996) and U.S. Pat. No. 5,593,853), small organic molecule libraries (see, e.g., benzodiazepines, Baum C&EN, January 18, page 33 (1993); isoprenoids, U.S. Pat. No. 5,569,588; thiazolidinones and metathiazanones, U.S. Pat. No. 5,549,974; pyrrolidines, U.S. Pat. Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Pat. No. 5,506,337; and benzodiazepines, U.S. Pat. No. 5,288,514).

D. Pharmaceutical Compositions

The compositions disclosed herein can be administered by any means known in the art. For example, compositions may be administered to a subject intravenously, intradermally, intraarterially, intraperitoneally, intralesionally, intracranially, intraarticularly, intraprostatically, intrapleurally, intratracheally, intranasally, intravitreally, intravaginally, intrarectally, topically, intratumorally, intramuscularly, intrathecally, subcutaneously, subconjunctivally, intravesicularly, mucosally, intrapericardially, intraumbilically, intraocularly, orally, locally, by inhalation, by injection, by infusion, by continuous infusion, by localized perfusion, via a catheter, via a lavage, in a creme, or in a lipid composition. Administration can be local, e.g., to the pulmonary mucosa, or systemic.

Solutions of the active compounds as free base or pharmacologically acceptable salt can be prepared in water suitably mixed with a surfactant, such as hydroxypropylcellulose. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms.

Pharmaceutical compositions can be delivered via intranasal or inhalable solutions or sprays, aerosols or inhalants. Nasal solutions can be aqueous solutions designed to be administered to the nasal passages in drops or sprays. Nasal solutions can be prepared so that they are similar in many respects to nasal secretions. Thus, the aqueous nasal solutions usually are isotonic and slightly buffered to maintain a pH of 5.5 to 6.5. In addition, antimicrobial preservatives, similar to those used in ophthalmic preparations, and appropriate drug stabilizers, if required, may be included in the formulation. Various commercial nasal preparations are known and can include, for example, antibiotics and antihistamines.

Oral formulations can include excipients such as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders. In some embodiments, oral pharmaceutical compositions will comprise an inert diluent or assimilable edible carrier, or they may be enclosed in hard or soft shell gelatin capsule, or they may be compressed into tablets, or they may be incorporated directly with the food of the diet. For oral therapeutic administration, the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. Such compositions and preparations should contain at least 0.1% of active compound. The percentage of the compositions and preparations may, of course, be varied and may conveniently be between about 2 to about 75% of the weight of the unit, or preferably between 25-60%. The amount of active compounds in such compositions is such that a suitable dosage can be obtained.
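The weight percentages stated above can be checked with simple arithmetic. The sketch below computes the percentage of active compound by weight in a dosage unit and reports which of the stated ranges it falls into. The function names and the 150 mg in 500 mg example are hypothetical, and the range labels merely restate the figures in the preceding paragraph.

```python
def percent_active(active_mg: float, unit_mg: float) -> float:
    """Percentage of active compound by weight in one dosage unit."""
    if unit_mg <= 0 or active_mg < 0 or active_mg > unit_mg:
        raise ValueError("Require 0 <= active_mg <= unit_mg and unit_mg > 0")
    return 100.0 * active_mg / unit_mg


def classify_percentage(percent: float) -> str:
    """Place a weight percentage within the ranges stated in the text."""
    if percent < 0.1:
        return "below stated minimum (0.1%)"
    if 25.0 <= percent <= 60.0:
        return "preferred range (25-60%)"
    if 2.0 <= percent <= 75.0:
        return "convenient range (about 2-75%)"
    return "at or above minimum, outside convenient range"


# Hypothetical tablet: 150 mg active compound in a 500 mg unit.
pct = percent_active(150.0, 500.0)
print(f"{pct:.1f}% -> {classify_percentage(pct)}")
```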

For parenteral administration in an aqueous solution, for example, the solution should be suitably buffered and the liquid diluent first rendered isotonic with sufficient saline or glucose. Aqueous solutions, in particular, sterile aqueous media, are especially suitable for intravenous, intramuscular, subcutaneous and intraperitoneal administration. For example, one dosage could be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or injected at the proposed site of infusion.

Sterile injectable solutions can be prepared by incorporating the active compounds or constructs in the required amount in the appropriate solvent followed by filter sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium. Vacuum-drying and freeze-drying techniques, which yield a powder of the active ingredient plus any additional desired ingredients, can be used to prepare sterile powders for reconstitution of sterile injectable solutions. The preparation of more, or highly, concentrated solutions for direct injection is also contemplated. DMSO can be used as solvent for extremely rapid penetration, delivering high concentrations of the active agents to a small area.

E. Treatment Regimes

The invention provides methods of treating, preventing, and/or ameliorating a pulmonary disorder in a subject in need thereof, optionally based on the diagnostic and predictive methods described herein. The course of treatment is best determined on an individual basis depending on the particular characteristics of the subject and the type of treatment selected. The treatment, such as those disclosed herein, can be administered to the subject on a daily, twice daily, bi-weekly, monthly or any applicable basis that is therapeutically effective. The treatment can be administered alone or in combination with any other treatment disclosed herein or known in the art. The additional treatment can be administered simultaneously with the first treatment, at a different time, or on an entirely different therapeutic schedule (e.g., the first treatment can be daily, while the additional treatment is weekly).

Administration of a composition for ameliorating the pulmonary disease, e.g., by treating elevated expression of the MUC5B gene, can be a systemic or localized administration. For example, treating a subject having a pulmonary disorder can include administering an inhalable or intranasal form of anti-MUC5B agent (MUC5B antagonist) on a daily basis or otherwise regular schedule. In some embodiments, the treatment is only on an as-needed basis, e.g., upon appearance of pulmonary disease symptoms.

VI. KITS

The invention provides kits for detection of pulmonary disease markers in a subject. The kit can be for personal use or provided to medical professionals. The kit can be a kit for diagnosing or prognosing a pulmonary disorder, or for monitoring the progression of disease or the efficacy of treatment.

In some embodiments, the kit includes components for assessing MUC5B gene expression comprising, e.g., a nucleic acid capable of detecting MUC5B RNA or a MUC5B protein binding agent, optionally labeled. One of skill will appreciate that MUC5B gene expression can be determined by measuring MUC5B RNA or protein. The kit can further include assay containers (tubes), buffers, or enzymes necessary for carrying out the detection assay.

In some embodiments, the kit includes components for determining whether the genome of the subject carries a genetic variant MUC5B gene, e.g., a nucleic acid that specifically hybridizes to a genetic variant MUC5B gene sequence. Other components in a kit can include DNA sequencing assay components, TaqMan® genotyping assay components, Meta Analysis, one or more detection system(s), one or more control samples, or a combination thereof. Kits can further include one or more agents where at least one of the agents is capable of associating with SNP rs35705950.

In some embodiments, the kit includes components to examine more than one pulmonary disease marker. For example, the kit can include marker detection agents, such as marker specific primers or probes attached to an addressable array. Exemplary markers include SNPs in the MUC5B gene, or genetic variants in other genes, e.g., Surfactant Protein A2, Surfactant Protein B, Surfactant Protein C, TERC, TERT, IL-1RN, IL-1α, IL-1β, TNF, Lymphotoxin α, TNF-RII, IL-10, IL-6, IL-12, IFNγ, TGFβ, CR1, ACE, IL-8, CXCR1 or CXCR2. In some embodiments, the expression level of the markers is detected instead of or in addition to the genetic sequence. In this case, useful pulmonary disease markers with aberrant expression include: Surfactant Protein A, Surfactant Protein D, KL-6/MUC1, CC16, CK-19, Ca 19-9, SLX, MCP-1, MIP-1a, ITAC, glutathione, type III procollagen peptide, sIL-2R, ACE, neopterin, beta-glucuronidase, LDH, CCL-18, CCL-2, CXCL12, MMP7, and osteopontin. Additional pulmonary disease markers can include the other MUC genes, e.g., MUC2, MUC5AC, and MUC6.

The kit will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which the testing agent can be suitably reacted or aliquoted. Kits can also include components for comparing results such as a suitable control sample, for example a positive and/or negative control. The kit can also include a collection device for collecting and/or holding the sample from the subject. The collection device can include a sterile swab or needle (for collecting blood), and/or a sterile tube (e.g., for holding the swab or a bodily fluid sample).

The following discussion of the invention is for the purposes of illustration and description, and is not intended to limit the invention to the form or forms disclosed herein.

Although the description of the invention has included description of one or more embodiments and certain variations and modifications, other variations and modifications are within the scope of the invention, e.g., as may be within the skill and knowledge of those in the art, after understanding the present disclosure. All publications, patents, patent applications, Genbank numbers, and websites cited herein are hereby incorporated by reference in their entireties for all purposes.

VII. EXAMPLES

Example 1 Sequencing of Pulmonary, Gel-Forming Mucins and Disease Association

Study Populations:

Subjects with FIP or IPF were identified and phenotyped. The diagnosis of IIP was established according to conventional criteria. Eligible subjects were at least 38 years of age and had IIP symptoms for at least 3 months. A high resolution computerized tomography (HRCT) scan was required to show definite or probable IIP according to predefined criteria, and a surgical lung biopsy was obtained in 46% of affected subjects. FIP families were defined by the presence of two or more cases of definite or probable IIP within three degrees, with at least one case of IIP established as definite/probable IPF. Exclusion criteria included significant exposure to known fibrogenic agents or an alternative etiology for ILD. Control subjects for genetic analysis were acquired (FIG. 1).

Linkage Analysis:

A genome-wide linkage screen was completed in 82 multiplex families using a DeCode linkage panel consisting of a total of 884 markers with an average inter-marker distance of 4.2 cM. Multipoint non-parametric linkage analysis was performed using Merlin, as previously described. Kong and Cox LOD scores were calculated using the S_pairs statistic under an exponential model; support intervals were determined using the one-LOD-score-down method.

Fine-Mapping of Chromosome 11:

To interrogate the linked region on the p-terminus of chromosome 11 (8.4 Mb bounded by rs702966 and rs1136966), fine mapping was performed by genotyping 306 tagging SNPs in 145 unrelated cases of FIP, 152 cases of IPF, and 233 Caucasian controls. Tests of association comparing FIP cases and IPF cases to controls were calculated under an additive model for the minor allele.

Resequencing of MUC2 and MUC5AC:

Primer pairs to generate overlapping amplicons for resequencing the proximal promoter and most exons of MUC2 and MUC5AC were designed on sequences masked for repetitive elements, SNPs, and homology to other regions of the genome.

Genetic Screen of Lung-Expressed, Gel-Forming Mucins:

A case-control association study was conducted in an independent population of FIP (N=83), sporadic IPF (N=492), and control (N=322) subjects (Table 2) using tagging and other SNPs localized across the lung-expressed, gel-forming mucin genes on chromosome 11. 175 SNPs were successfully genotyped using the Sequenom iPlex assays, and Haploview was used to test SNPs for allelic association with FIP and IPF. For those SNPs remaining significant after Bonferroni correction, odds ratios were estimated under an additive model for the rare allele after adjustment for age and gender via logistic regression. Chi-squared goodness-of-fit tests were computed to evaluate the evidence for disease-model explanations for genotypic departures from Hardy-Weinberg Equilibrium (HWE) among cases. For the most highly-associated SNP, linkage and association modeling in pedigrees were used to test whether, in the original linkage families, the SNP was linked to the disease locus, was in linkage disequilibrium with the disease locus, and could account for the linkage signal.
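The Hardy-Weinberg check mentioned above can be sketched as a one-degree-of-freedom chi-squared goodness-of-fit test. This is a minimal illustration only: the function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from the study, and the software actually used for this step is not stated in the text.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test (1 df) against Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-squared distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only
chi2, p = hwe_chi_square(n_aa=250, n_ab=60, n_bb=12)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}")
```

A small P value indicates that the observed genotype counts depart from the proportions expected under HWE; among cases, such a departure can itself be a consequence of a genotype-dependent disease risk.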

Strong evidence for linkage based on the 82 FIP families occurred on chromosome 11 where the maximum multipoint LOD score was 3.3 (p=0.00004, D11S1318; FIG. 3). The 1-LOD support interval for this linked region was bounded by markers D11S4046 and D11S1760, spanning 3.4 Mb. Since D11S4046 was the most telomeric marker typed, the region of interest was inclusive of the p-terminus of chromosome 11. Within the 8.4 Mb larger region, 306 tagging SNPs were selected for fine-mapping in a case-control association analysis (145 FIP cases, 152 IPF cases, and 233 controls). Allelic association testing revealed 7 SNPs within the mucin 2 (MUC2) gene significantly associated with either FIP or IPF. MUC2 is contained in a genomic region harboring 4 gel-forming mucin genes (telomere to centromere: MUC6, MUC2, MUC5AC, and MUC5B). While there are reported recombination hotspots located between MUC6 and MUC2, and within the proximal portion of MUC5B, markers within MUC2 and MUC5AC exhibit strong linkage disequilibrium (LD) 17. Thus, MUC2 and MUC5AC were selected for resequencing using the oligonucleotide primers. Resequencing analysis identified 330 genetic variants in MUC2 and 195 genetic variants in MUC5AC. Allelic association testing between these genetic variants and disease status yielded 7 independent SNPs in both MUC2 and MUC5AC significantly associated with either FIP or IPF disease status.
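As a rough guide to how a LOD score relates to a pointwise P value, the sketch below uses the common asymptotic approximation in which 2·ln(10)·LOD is treated as a 50:50 mixture of a point mass at zero and a chi-squared variable with one degree of freedom. This is an assumption made for illustration; the text does not state how the reported P value was derived, and the function name `lod_to_p` is hypothetical.

```python
import math


def lod_to_p(lod):
    """Approximate one-sided pointwise P value for a LOD score.

    Assumes 2*ln(10)*LOD behaves asymptotically as a 50:50 mixture of
    zero and a chi-squared variable with 1 degree of freedom.
    """
    if lod <= 0:
        return 1.0
    chi2 = 2.0 * math.log(10.0) * lod
    # 0.5 * upper tail of chi-squared with 1 df
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


print(f"LOD 3.3 -> p ~ {lod_to_p(3.3):.2g}")
```

Because the reported LOD score is rounded to one decimal place, the value computed this way is expected to agree with the reported P value only approximately.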

We designed a genetic screen for common genetic variation across the genomic region containing the 3 gel-forming mucin genes expressed in the lung (MUC2, MUC5AC, and MUC5B) in an independent population of subjects with IIP (FIP=83 and IPF=492) and controls (n=322) (FIG. 1, Table 2). 19 independent SNPs were observed to be significantly associated by allelic test with either or both FIP or IPF after Bonferroni correction for multiple comparisons (Table 1). Of these 19 SNPs, 6 occurred in MUC2, one in the MUC2-MUC5AC intergenic region, 4 in MUC5AC, 3 in the MUC5AC-MUC5B intergenic region, and 5 in the putative MUC5B promoter, within 4 kb of the MUC5B transcription start site 18, 19 (Table 1).

Of significance, a SNP in the putative promoter of MUC5B, 3 kb upstream of the transcription start site (rs35705950), was found to have the most substantial effect on both FIP and IPF. The minor allele of this SNP was present at a frequency of 33.8% in FIP cases, 37.5% in IPF cases, and 9.1% among controls (allelic association; FIP P=1.2×10⁻¹⁵, IPF P=2.5×10⁻³⁷). Notably, the genotype frequencies for rs35705950 were consistent with HWE in controls, but not among IPF cases (P=6.0×10⁻¹¹) and nearly so among FIP cases (P=0.11). By comparing the genotype frequencies observed in cases and controls to those expected if rs35705950 is a true risk locus, the data demonstrate that these genotype frequencies are consistent with an additive genotypic effect on disease risk conferred by rs35705950 (P=0.88 and P=0.77, respectively, for FIP and IPF to reject additive effect). In addition, the disease allele frequency and penetrance estimates suggest a similar disease model for both FIP and IPF. The odds ratios for disease for subjects heterozygous and homozygous for the rarer allele of this SNP were 6.8 (95% CI 3.9-12.0) and 20.8 (95% CI 3.8-113.7) for FIP, and 9.0 (95% CI 6.2-13.1) and 21.8 (95% CI 5.1-93.5) for IPF (Table 1). To ensure this SNP was not tagging another SNP in the MUC5B promoter region, the 4 kb region upstream of the MUC5B transcription start site was resequenced in 48 IPF cases and 48 controls (Table 3). Thirty-four genetic variants were observed, but none had a pairwise r² LD value with rs35705950 above 0.2 (Table 4). Finally, among the original linkage families, rs35705950 was found to be both linked to (P=0.04) and in linkage disequilibrium with (P=1.5×10⁻⁹) the disease locus. While there is some evidence for other linked variants in the region (P=0.054), these results verify the relevance of this SNP to disease in these families.
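An allelic association test of the kind reported above can be sketched as a 2×2 chi-squared test on allele counts. The sketch below is illustrative only: the allele counts are reconstructed from the rounded minor allele frequencies (33.8% in 83 FIP cases, 9.1% in 322 controls) and are therefore approximations, the exact denominators used in the study may differ, and the function names are hypothetical.

```python
import math


def allele_counts(n_subjects, minor_allele_freq):
    """Approximate (minor, major) allele counts from a rounded frequency."""
    chromosomes = 2 * n_subjects
    minor = round(chromosomes * minor_allele_freq)
    return minor, chromosomes - minor


def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """2x2 allelic test: returns (odds_ratio, chi2, p_value) with 1 df."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-squared upper tail, 1 df
    return odds_ratio, chi2, p_value


# Counts reconstructed from rounded frequencies (an approximation)
fip = allele_counts(83, 0.338)
ctrl = allele_counts(322, 0.091)
odds_ratio, chi2, p = allelic_association(*fip, *ctrl)
print(f"allelic OR = {odds_ratio:.2f}, chi2 = {chi2:.1f}, p = {p:.2g}")
```

Note that the allelic odds ratio computed this way is unadjusted and is not the same quantity as the genotypic, covariate-adjusted odds ratios in Table 1.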

TABLE 1 Genotypic association results assuming an additive model from the genetic screen of lung-expressed, gel-forming mucins in subjects with IIP (FIP = 83 and IPF = 492) and controls (n = 322).

| SNP | Nucleotide Change | Amino Acid Change | Mucin Region | Hg19 Position | Minor Allele Frequency (%), FIP (n = 83) | Minor Allele Frequency (%), IPF (n = 492) | Minor Allele Frequency (%), Controls (n = 322) | FIP Odds Ratio (95% CI) | FIP P Value | IPF Odds Ratio (95% CI) | IPF P Value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs10902081 | C/T | | MUC2 Int7 | 1079809 | 37.2 | 38.6 | 47.9 | 0.6 (0.4-0.9) | 0.011 | 0.7 (0.5-0.8) | 4.3 × 10⁻⁴ |
| rs7127117* | T/C | | MUC2 Int7 | 1079879 | 49.3 | 60 | 47.4 | 1.0 (0.7-1.5) | 0.826 | 1.6 (1.3-2.0) | 6.9 × 10⁻⁵ |
| rs41453346 | C/T | Tyr426Tyr | MUC2 Ex10 | 1080894 | 5 | 6.5 | 2.2 | 1.9 (0.8-4.3) | 0.124 | 2.8 (1.6-5.2) | 0.001 |
| rs41480348 | G/A | Thr618Thr | MUC2 Ex15 | 1082605 | 8.4 | 6.5 | 12.1 | 0.7 (0.4-1.2) | 0.188 | 0.5 (0.4-0.8) | 0.001 |
| rs7934606* | C/T | | MUC2 Int31 | 1093945 | 49.4 | 54 | 40.5 | 1.4 (1.0-2.0) | 0.055 | 1.7 (1.4-2.2) | 3.8 × 10⁻⁶ |
| rs10902089* | A/G | | MUC2 Int31 | 1094357 | 57.9 | 58.8 | 48.5 | 1.5 (1.0-2.1) | 0.031 | 1.5 (1.2-1.9) | 2.9 × 10⁻⁴ |
| rs9667239 | C/T | | MUC2-5AC Intergenic | 1143101 | 22.5 | 21 | 12.5 | 2.2 (1.4-3.6) | 0.001 | 1.9 (1.4-2.7) | 5.6 × 10⁻⁵ |
| rs55846509 | G/A | Arg47Gln | MUC5AC Ex2 | 1154294 | 3.1 | 5.5 | 1.6 | 1.7 (0.6-5.1) | 0.316 | 3.6 (1.7-7.3) | 0.001 |
| rs28403537 | C/T | Ala497Val | MUC5AC Ex12 | 1161315 | 8.9 | 13 | 3.4 | 2.7 (1.3-5.3) | 0.006 | 4.6 (2.8-7.6) | 3.2 × 10⁻⁹ |
| MUC5AC-025447* | C/T | | MUC5AC Int26 | 826476** | 20.1 | 21 | 13.8 | 1.6 (1.0-2.5) | 0.053 | 1.6 (1.2-2.2) | 0.003 |
| rs35288961 | G/T | | MUC5AC Int46 | 1220462 | 28.8 | 26.6 | 15.9 | 2.2 (1.4-3.5) | 3.2 × 10⁻⁴ | 2.0 (1.5-2.6) | 3.7 × 10⁻⁶ |
| rs35671223 | C/T | | MUC5AC-5B Intergenic | 1227069 | 42.6 | 42.4 | 33.4 | 1.4 (1.0-2.0) | 0.05 | 1.5 (1.2-1.9) | 0.001 |
| rs28654232 | C/T | | MUC5AC-5B Intergenic | 1229227 | 21.6 | 22.8 | 32.9 | 0.6 (0.4-0.9) | 0.009 | 0.6 (0.5-0.8) | 1.1 × 10⁻⁴ |
| rs34595903* | C/T | | MUC5AC-5B Intergenic | 1230393 | 21.5 | 23.3 | 34.8 | 0.5 (0.3-0.7) | 0.001 | 0.5 (0.4-0.7) | 2.4 × 10⁻⁶ |
| rs2672794 | C/T | | MUC5B Prm | 1241005 | 27.2 | 27.5 | 40.4 | 0.5 (0.3-0.8) | 0.001 | 0.5 (0.4-0.7) | 1.9 × 10⁻⁷ |
| rs35705950 | G/T | | MUC5B Prm | 1241221 | 33.8 | 37.5 | 9.1 | 6.2 (3.7-10.4) | 3.7 × 10⁻¹² | 8.3 (5.8-11.9) | 4.6 × 10⁻³¹ |
| rs35619543* | G/T | | MUC5B Prm | 1242250 | 40.3 | 39 | 23.8 | 2.4 (1.6-3.6) | 3.3 × 10⁻⁵ | 2.1 (1.6-2.8) | 1.5 × 10⁻⁵ |
| rs12804004 | G/T | | MUC5B Prm | 1242299 | 39.2 | 39.4 | 48.9 | 0.6 (0.4-0.9) | 0.019 | 0.6 (0.5-0.8) | 1.2 × 10⁻⁴ |
| rs868903* | T/C | | MUC5B Prm | 1242690 | 65.4 | 61 | 49.5 | 1.8 (1.3-2.6) | 0.001 | 1.6 (1.3-2.1) | 2.8 × 10⁻⁵ |

*For these SNPs, DNA was available for 304 controls.
**Nucleotide position based on NW_001838016.1.

TABLE 2 Demographic characteristics of subjects in the re-sequencing and mucin genetic screen analyses.

| Characteristic | Re-Sequencing Subjects: FIP | Re-Sequencing Subjects: IPF | Re-Sequencing Subjects: Control | Genetic Screen of Lung-expressed Gel-forming Mucins Subjects: FIP | Genetic Screen of Lung-expressed Gel-forming Mucins Subjects: IPF | Genetic Screen of Lung-expressed Gel-forming Mucins Subjects: Control |
|---|---|---|---|---|---|---|
| Number of subjects | 69 | 96 | 54 | 83 | 492 | 322* |
| Male gender | 41 (60%) | 61 (64%) | 18 (34%) | 44 (53.0%) | 352 (71.5%) | 147 (45.7%) |
| Caucasian | 68 (99%) | 89 (93%) | 53 (98%) | 83 (100%) | 492 (100%) | 322 (100%) |
| Age at diagnosis | 66 ± 10 | 65 ± 8 | 68 ± 8 | 66.3 ± 11.2 | 67.2 ± 8.1 | 60.3 ± 12.6 |
| Ever smoked | 44 (64%) | 71 (74%) | 25 (47%) | 46 (56.8%) | 342 (69.9%) | 245 (76.6%) |

*325 control subjects were included in allelic association analyses but only 322 in genotypic regression analyses as demographic variables needed for regression were missing for 3 subjects. Additionally, in some genotyping multiplexes for the lung-expressed gel forming mucins, 18 of the 322 controls were not screened due to lack of DNA availability.

TABLE 3 Oligos used in resequencing of the MUC5B promoter

| MUC5B Promoter Amplicon | Forward Primer 5′ > 3′; Reverse Primer 5′ > 3′ | Amplicon Size (bp) | Hg19 Coordinates |
|---|---|---|---|
| MUC5B-Prim-1 | GGTTCTCCTTGTCTTGCAGCCCCTATGGGCTCTTGGTCTGCTCAGAG (SEQ ID NO: 1) (SEQ ID NO: 2) | 616 | Chr11: 1239997-1240612 |
| MUC5B-Prim-2 | GGGCCTGGCTCTGAGTACACATCCTAAGGAAAGGGACACAGCCGGTTCC (SEQ ID NO: 3) (SEQ ID NO: 4) | 644 | Chr11: 1240556-1241199 |
| MUC5B-Prim-3 | GGGTCCCCATTCATGGCAGGATTTTTCTCCATGGCAGAGCTGGGACC (SEQ ID NO: 5) (SEQ ID NO: 6) | 601 | Chr11: 1240957-1241557 |
| MUC5B-Prim-4 | CTAGTGGGAGGGACGAGGGCAAAGTCTCGTGGCTGTGACTGCACCCAG (SEQ ID NO: 7) (SEQ ID NO: 8) | 610 | Chr11: 1241386-1241995 |
| MUC5B-Prim-5 | TTGGCTAAGGTGGGAGACCT (SEQ ID NO: 9); AGCTTGGGAATGTGAGAACG (SEQ ID NO: 10) | 700 | Chr11: 1241791-1242490 |
| MUC5B-Prim-6 | CATGAGGGGTGACAGGTGGCAAA (SEQ ID NO: 11); CCCGCGTTTGTCTTTCTGAAGTT (SEQ ID NO: 12) | 676 | Chr11: 1242392-1243067 |
| MUC5B-Prim-7 | GGTCAGAAGCTTTGAAGATGGGC (SEQ ID NO: 13); CTTGTCCAATGCCAGCCCTGATC (SEQ ID NO: 14) | 607 | Chr11: 1242985-1243591 |
| MUC5B-Prim-8 | CTGCCAGGGTTAATGAGGAG (SEQ ID NO: 15); GGATCAGGAAGGATTTGCAG (SEQ ID NO: 16) | 663 | Chr11: 1243491-1244153 |
| MUC5B-Prim-9 | AGGCAGGCTGGCTGACCACTGTTTCGTGAAGACAGCATCGAGAGGGG (SEQ ID NO: 17) (SEQ ID NO: 18) | 501 | Chr11: 1243966-1244466 |
| MUC5B-Prim-5 Seq Pr. | TTGGCTAAGGTGGGAGACCT (SEQ ID NO: 19) | | Chr11: 1241791-1241810 |
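The amplicon sizes and Hg19 coordinates in Table 3 can be cross-checked with a short script. The sketch below assumes the coordinates are 1-based and inclusive, an assumption that is consistent with the reported sizes; the names `AMPLICONS`, `amplicon_length`, and `adjacent_overlaps` are illustrative.

```python
# (name, start, end, reported size in bp); Chr11, Hg19, values from Table 3
AMPLICONS = [
    ("MUC5B-Prim-1", 1239997, 1240612, 616),
    ("MUC5B-Prim-2", 1240556, 1241199, 644),
    ("MUC5B-Prim-3", 1240957, 1241557, 601),
    ("MUC5B-Prim-4", 1241386, 1241995, 610),
    ("MUC5B-Prim-5", 1241791, 1242490, 700),
    ("MUC5B-Prim-6", 1242392, 1243067, 676),
    ("MUC5B-Prim-7", 1242985, 1243591, 607),
    ("MUC5B-Prim-8", 1243491, 1244153, 663),
    ("MUC5B-Prim-9", 1243966, 1244466, 501),
]


def amplicon_length(start, end):
    """Length of an amplicon given 1-based inclusive coordinates."""
    return end - start + 1


def adjacent_overlaps(amplicons):
    """Overlap in bp between each amplicon and the next one in the tiling."""
    overlaps = []
    for (name_a, _, end_a, _), (name_b, start_b, _, _) in zip(amplicons, amplicons[1:]):
        overlaps.append((name_a, name_b, end_a - start_b + 1))
    return overlaps


for name, start, end, size in AMPLICONS:
    status = "ok" if amplicon_length(start, end) == size else "MISMATCH"
    print(f"{name}: {amplicon_length(start, end)} bp ({status})")

for name_a, name_b, bp in adjacent_overlaps(AMPLICONS):
    print(f"{name_a} / {name_b}: {bp} bp overlap")
```

A positive overlap between every pair of neighboring amplicons indicates that the nine amplicons tile the resequenced promoter region without gaps.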

TABLE 4 SNPs identified in resequencing of the MUC5B promoter

| Hg19 Position | SNP Name | Base Change | SEQ ID NO: | Flanking Sequence |
|---|---|---|---|---|
| 240338 | rs2672792 | T/C | 20 | GTCACCTGCCCAGGTCCCCGAGGCC[T/C]GGAACACCTTCCTGCTGGGCCCACC |
| 240485 | rs72636989 | G/A | 21 | CCACCCCAGGAGTTGGGGGGCCCCCGT[G/A]CCAGGGAGCAGGAGGCTGCCGAGG |
| 240925 | MUC5B-Prm1 | C/T | 22 | GTGGCCCTGATCACTGGTGCCTGGA[C/T]GGCCTCTGAAGGGGTCTGTGGGGTC |
| 241005 | rs2672794 | C/T | 23 | AACCCCCCTCGGGTTCTGTGTGGTC[C/T]AGGCCGCCCCTTTGTCTCCACTGCC |
| 241221 | rs35705950 | G/T | 24 | TTTCTTCCTTTATCTTCTGTTTTCAGC[G/T]CCTTCAACTGTGAAGAAGTGA |
| 241361 | MUC5B-Prm2 | A/G | 25 | TGCCCCGGACCCAGCCCAGTTCCCA[A/G]TGGGCCCTCTGCCCGGGGAGGTGC |
| 241762 | MUC5B-Prm3 | C/T | 26 | GGTGGGCATCGGCTTGTGAGCTGGAGCCG[C/T]GGGCAGGGAGGGGGGATGTCACGAG |
| 241821 | rs11042491 | G/A | 27 | GGCTAAGGTGGGAGACCTGGGCGGGTGC[G/A]TCGGGGGGACGTCTGCAGCAGAGGC |
| 241848 | rs2735726 | T/C | 28 | TGCGTCGGGGGGACGTCTGCAGCAGAGGCC[T/C]GGGCAGCAGGCACACCCCTCCTGCCAG |
| 241993 | MUC5B-Prm4 | G/A | 29 | GGGGCCTGGGTGCAGTCACAGCCAC[G/A]AGCCCAGGGGTGGGGACTCTGGCC |
| 242092 | MUC5B-Prm5 | C/T | 30 | CCCCTCCCACCGTGCCGTGCTGCAG[C/T]GGGTCTACCGGCCTGGATGTGAAA |
| 242101 | MUC5B-Prm6 | C/T | 31 | CCGTGCCGTGCTGCAGCGGGTCTAC[C/T]GGCCTGGATGTGAAAGAGAGCTTG |
| 242227 | rs11042646 | C/T | 32 | AGTCCCGGAAGTGAGCGGGGAGCTA[C/T]GCTGAGATCTGGGAGACCCCCTGC |
| 242244 | rs55974837 | C/T | 33 | GGGAGCTACGCTGAGATCTGGGAGA[C/T]CCCCTGCCCCCACCCAGGTACAGG |
| 242250 | rs35619543 | G/T | 34 | TACGCTGAGATCTGGGAGACCCCCT[G/T]CCCCCACCCAGGTACAGGGCCAGG |
| 242299 | rs12804004 | T/G | 35 | GCAGAAGCCCGAGGTGTGCCCTGAG[T/G]TAAAGAAACCGTCACAAAGAACAA |
| 242472 | rs56031419 | G/A | 36 | TGTCTCCGCCCTCCATCTCCAGAAC[G/A]TTCTCACATTCCCAAGCTGAAACC |
| 242508 | rs868902 | C/A | 37 | CCCAAGCTGAAACCCTGTCCCCATG[C/A]AACACCAGCTCACCATCCCCTCTGCC |
| 242567 | MUC5B-Prm7 | C/T | 38 | GGCGCCCACCGTCCACACTCCGTCT[C/T]TGCGGGTTTCATGACTCCAGGGGCAG |
| 242599 | MUC5B-Prm8 | G/A | 39 | TTTCATGACTCCAGGGGCAGCACAC[G/A]AGTGGCCCCTCCTGCCTTTGTCCTC |
| 242607 | MUC5B-Prm9 | C/T | 40 | CTCCAGGGGCAGCACACGAGTGGCC[C/T]CTCCTGCCTTTGTCCTCTGTGTCCA |
| 242690 | rs868903 | C/T | 41 | CCCCCATGGAGCAGCCTGGGCCAGCC[C/T]CTCCTTTTCACGGCTGAACCGTAT |
| 242910 | MUC5B-Prm10 | G/A | 42 | ACCCCCACCAGCAGGGCACAGGGCTCC[G/A]GGTCCCCACGTCTCTGCCAACACTT |
| 242977 | MUC5B-Prm11 | G/A | 43 | CTTGATCCCCGCCATCCTATTGAGC[G/A]TGAGACAGGTCAGAAGCTTTGAAG |
| 243218 | MUC5B-Prm12 | G/A | 44 | GTCTGCGCCACGGAGCATTCAGGAC[G/A]CTGGTGACCAGGGAGCCAGGAGGT |
| 243378 | rs885455 | A/G | 45 | CGTCAAGGAGGTTTACCACATAGCCCCC[A/G]GGAAGCCCACCCGACACCAGCCGGA |
| 243391 | rs885454 | G/A | 46 | TTTACCACATAGCCCCCRGGAAGCCCACCC[G/A]ACACCAGCCGGAGGTGCTAGGCTTC |
| 243409 | MUC5B-Prm13 | T/C | 47 | CCCACCCGACACCAGCCGGAGGTGC[T/C]AGGCTTCTGCGGCTCCCACCTGGG |
| 243911 | MUC5B-Prm14 | G/A | 48 | GGACCCATGGTCAGTGGCTGGGGGT[G/A]CTGCCCAGAGGCTGGGATTCCCTTC |
| 244060 | rs7115457 | G/A | 49 | GCCATCTAGGACGGGTGCCAGGTGG[G/A]GTAGGCCCTTCTCTCCCTTCCGATT |
| 244080 | rs7118568 | C/G | 50 | GGTGGGGTAGGCCCTTCTCTCCCTT[C/G]CGATTCTCAGAAGCTGCTGGGGGTG |
| 244197 | rs56235854 | G/A | 51 | AGCCCCTCCCCGAGAGCAAACACAC[G/A]TGGCTGGAGCGGGGAAGAGCATGGTGC |
| 244219 | rs2735738 | T/C | 52 | CACGTGGCTGGAGCGGGGAAGAGCA[T/C]GGTGCCCTGCGTGGCCTGGCCTGGC |
| 244438 | MUC5B-Prm15 | C/T | 53 | GCCGCAGGCAGGTAAGAGCCCCCCA[C/T]TCCGCCCCCTCTCGATGCTGTCTT |

indicates data missing or illegible when filed
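The bracketed flanking-sequence notation used in Table 4 (for example `...CAGC[G/T]CCTT...`) can be expanded programmatically into the two allele-specific sequences. The sketch below is illustrative only: the helper names `expand_alleles` and `snp_window` are hypothetical, and the extracted window is an example of string handling, not a validated probe or primer design.

```python
import re

# Left flank, two alleles in brackets, right flank. IUPAC ambiguity codes
# (e.g. R) are allowed in the flanks because some Table 4 entries contain them.
FLANK_RE = re.compile(r"^([ACGTRYKMSWN]*)\[([ACGT])/([ACGT])\]([ACGTRYKMSWN]*)$")


def parse_flank(flank):
    """Split 'LEFT[A1/A2]RIGHT' into (left, allele1, allele2, right)."""
    match = FLANK_RE.match(flank.replace(" ", "").upper())
    if match is None:
        raise ValueError(f"unrecognized flanking sequence: {flank!r}")
    return match.groups()


def expand_alleles(flank):
    """Return the full sequence carrying allele 1 and the one carrying allele 2."""
    left, a1, a2, right = parse_flank(flank)
    return left + a1 + right, left + a2 + right


def snp_window(flank, allele, half_width):
    """Sequence of up to half_width bases on each side of the chosen allele."""
    left, a1, a2, right = parse_flank(flank)
    if allele not in (a1, a2):
        raise ValueError(f"allele {allele!r} not listed for this SNP")
    return left[-half_width:] + allele + right[:half_width]


# rs35705950 flanking sequence as listed in Table 4 (SEQ ID NO: 24)
RS35705950 = "TTTCTTCCTTTATCTTCTGTTTTCAGC[G/T]CCTTCAACTGTGAAGAAGTGA"
g_seq, t_seq = expand_alleles(RS35705950)
print(g_seq)
print(t_seq)
print(snp_window(RS35705950, "T", 10))
```

Keeping the two alleles as separate full-length strings makes it straightforward to compare a candidate oligonucleotide against the wildtype (G) and variant (T) sequences.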

Next, the relationship between the rs35705950 SNP and the 18 other SNPs significantly associated with IIP was analyzed. Testing pairwise LD between these SNPs by the r² statistic, 10 of the 18 SNPs were found to exhibit low level LD (r²=0.15-0.27) with rs35705950 among IPF cases, suggesting the significance of these SNPs is due to LD with rs35705950 (FIG. 3). Using genotypic logistic regression models to adjust for rs35705950 effects, we observed that the coefficients and the significance of the corresponding P values were substantially reduced for all 18 SNPs which were previously associated with FIP and/or IPF (Table 5). After controlling for rs35705950, only one SNP retained nominal significance for IPF (rs41480348, P=0.04). It was demonstrated that the significance of the rs35705950 SNP was largely unaffected by adjustment for any of the 18 SNPs tested (P value for all SNP models was less than 1.7×10⁻⁹ for FIP and 1.1×10⁻²⁴ for IPF; Table 5). These results demonstrate a strong independent effect of the rs35705950 SNP on both FIP and IPF.
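For reference, the pairwise r² statistic can be sketched from two-locus haplotype counts as follows. This is a simplified illustration that assumes phased haplotype counts are available; with unphased genotype data, haplotype frequencies are typically estimated first, and the procedure used in the study is not detailed in the text. The function name and the example counts are hypothetical.

```python
def ld_r_squared(n_ab, n_aB, n_Ab, n_AB):
    """Pairwise r^2 between two biallelic loci from haplotype counts.

    n_ab, n_aB, n_Ab, n_AB are counts of the four haplotypes formed by
    alleles a/A at the first locus and b/B at the second locus.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_A = (n_Ab + n_AB) / n          # frequency of allele A
    p_B = (n_aB + n_AB) / n          # frequency of allele B
    d = n_AB / n - p_A * p_B         # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Hypothetical haplotype counts, for illustration only
print(f"r^2 = {ld_r_squared(n_ab=520, n_aB=110, n_Ab=180, n_AB=190):.3f}")
```

Values near 0 indicate that the two SNPs are close to independent, while values near 1 indicate that one SNP is almost a perfect proxy for the other.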

TABLE 5 Genotypic logistic regression models for the 19 significant SNPs in the screen of lung-expressed gel-forming mucins alone, and after adjusting for rs35705950, in patients with IPF or FIP.

| Model # | SNP | IPF Single SNP Model Odds Ratio (95% C.I.) | P Value | IPF rs35705950 Model Odds Ratio (95% C.I.) | P Value | FIP Single SNP Model Odds Ratio (95% C.I.) | P Value | FIP rs35705950 Model Odds Ratio (95% C.I.) | P Value |
|---|---|---|---|---|---|---|---|---|---|
| 1 | rs10902081 | 0.7 (0.5-0.8) | 4.3 × 10⁻⁴ | 0.9 (0.7-1.2) | 0.429 | 0.6 (0.4-0.9) | 0.011 | 0.8 (0.5-1.2) | 0.25 |
| | rs35705950 | x | x | 8.3 (5.7-11.9) | 1.5 × 10⁻²⁸ | x | x | 5.9 (3.5-10.1) | 6.6 × 10⁻¹¹ |
| 2 | rs7127117 | 1.6 (1.3-2.0) | 6.9 × 10⁻⁵ | 1.1 (0.8-1.4) | 0.509 | 1.0 (0.7-1.5) | 0.826 | 0.7 (0.4-1.1) | 0.094 |
| | rs35705950 | x | x | 7.9 (5.4-11.6) | 1.3 × 10⁻²⁵ | x | x | 6.3 (3.5-11.4) | 7.3 × 10⁻¹⁰ |
| 3 | rs41453346 | 2.8 (1.6-5.2) | 0.001 | 1.1 (0.6-2.2) | 0.72 | 1.9 (0.8-4.3) | 0.124 | 1.2 (0.5-3.0) | 0.653 |
| | rs35705950 | x | x | 8.1 (5.6-11.8) | 2.7 × 10⁻²⁸ | x | x | 6.1 (3.6-10.3) | 1.2 × 10⁻¹¹ |
| 4 | rs41480348 | 0.5 (0.4-0.8) | 0.001 | 0.6 (0.4-1.0) | 0.04 | 0.7 (0.4-1.2) | 0.188 | 0.9 (0.5-1.7) | 0.75 |
| | rs35705950 | x | x | 7.9 (5.5-11.3) | 2.1 × 10⁻²⁹ | x | x | 6.1 (3.6-10.2) | 1.0 × 10⁻¹¹ |
| 5 | rs7934606 | 1.7 (1.4-2.2) | 3.8 × 10⁻⁶ | 1.0 (0.7-1.3) | 0.876 | 1.4 (1.0-2.0) | 0.055 | 0.9 (0.6-1.3) | 0.473 |
| | rs35705950 | x | x | 8.7 (5.8-12.9) | 1.4 × 10⁻²⁶ | x | x | 6.7 (3.8-11.9) | 7.5 × 10⁻¹¹ |
| 6 | rs10902089 | 1.5 (1.2-1.9) | 2.9 × 10⁻⁴ | 0.9 (0.7-1.2) | 0.69 | 1.5 (1.0-2.1) | 0.031 | 1.0 (0.7-1.6) | 0.813 |
| | rs35705950 | x | x | 8.3 (5.6-12.2) | 1.3 × 10⁻²⁶ | x | x | 6.1 (3.6-10.5) | 6.2 × 10⁻¹¹ |
| 7 | rs9667239 | 1.9 (1.4-2.7) | 5.6 × 10⁻⁵ | 0.8 (0.5-1.2) | 0.3 | 2.2 (1.4-3.6) | 0.001 | 1.1 (0.6-2.0) | 0.668 |
| | rs35705950 | x | x | 8.9 (6.0-13.3) | 8.2 × 10⁻²⁷ | x | x | 5.8 (3.3-10.2) | 6.0 × 10⁻¹⁰ |
| 8 | rs55846509 | 3.6 (1.7-7.3) | 0.001 | 1.0 (0.5-2.3) | 0.96 | 1.7 (0.6-5.1) | 0.32 | 0.8 (0.3-2.5) | 0.706 |
| | rs35705950 | x | x | 8.3 (5.7-12.1) | 2.7 × 10⁻²⁸ | x | x | 6.4 (3.8-10.7) | 4.8 × 10⁻¹² |
| 9 | rs28403537 | 4.6 (2.8-7.6) | 3.2 × 10⁻⁸ | 1.5 (0.8-2.6) | 0.2 | 2.7 (1.3-5.3) | 0.006 | 0.8 (0.3-1.8) | 0.53 |
| | rs35705950 | x | x | 7.6 (5.2-11.2) | 1.1 × 10⁻²⁴ | x | x | 6.7 (3.8-11.8) | 4.7 × 10⁻¹¹ |
| 10 | MUC5AC-025447 | 1.6 (1.2-2.2) | 0.003 | 1.1 (0.8-1.6) | 0.49 | 1.6 (1.0-2.5) | 0.053 | 1.4 (0.8-2.4) | 0.19 |
| | rs35705950 | x | x | 7.7 (5.3-11.2) | 3.1 × 10⁻²⁷ | x | x | 6.0 (3.5-10.3) | 4.7 × 10⁻¹¹ |
| 11 | rs35288961 | 2.0 (1.5-2.6) | 3.7 × 10⁻⁶ | 1.1 (0.8-1.5) | 0.58 | 2.2 (1.4-3.5) | 3.2 × 10⁻⁴ | 1.3 (0.7-2.1) | 0.384 |
| | rs35705950 | x | x | 7.9 (5.4-11.5) | 6.6 × 10⁻²⁷ | x | x | 5.7 (3.3-10.0) | 1.3 × 10⁻⁹ |
| 12 | rs35671223 | 1.5 (1.2-1.9) | 0.001 | 0.9 (0.7-1.2) | 0.46 | 1.4 (1.0-2.0) | 0.05 | 0.9 (0.6-1.4) | 0.61 |
| | rs35705950 | x | x | 8.5 (5.8-12.4) | 1.1 × 10⁻²⁸ | x | x | 6.3 (3.6-10.9) | 5.4 × 10⁻¹¹ |
| 13 | rs28654232 | 0.6 (0.5-0.8) | 1.1 × 10⁻⁴ | 0.9 (0.7-1.1) | 0.29 | 0.6 (0.4-0.9) | 0.009 | 0.7 (0.5-1.1) | 0.167 |
| | rs35705950 | x | x | 8.0 (5.5-11.5) | 5.7 × 10⁻²⁹ | x | x | 5.7 (3.4-9.6) | 5.8 × 10⁻¹¹ |
| 14 | rs34595903 | 0.5 (0.4-0.7) | 2.4 × 10⁻⁶ | 0.8 (0.6-1.1) | 0.116 | 0.5 (0.3-0.7) | 0.001 | 0.6 (0.4-1.0) | 0.041 |
| | rs35705950 | x | x | 7.4 (5.1-10.8) | 7.0 × 10⁻²⁶ | x | x | 5.1 (3.0-8.6) | 1.7 × 10⁻⁹ |
| 15 | rs2672794 | 0.5 (0.4-0.7) | 1.9 × 10⁻⁷ | 0.9 (0.7-1.2) | 0.442 | 0.5 (0.3-0.8) | 0.001 | 0.7 (0.4-1.1) | 0.152 |
| | rs35705950 | x | x | 8.0 (5.5-11.6) | 2.5 × 10⁻²⁷ | x | x | 5.5 (3.2-9.3) | 3.2 × 10⁻¹⁰ |
| 16 | rs35619543 | 2.1 (1.6-2.8) | 1.5 × 10⁻⁸ | 1.3 (0.9-1.7) | 0.145 | 2.4 (1.6-3.6) | 3.3 × 10⁻⁵ | 1.3 (0.8-2.1) | 0.296 |
| | rs35705950 | x | x | 7.6 (5.2-11.2) | 7.0 × 10⁻²⁵ | x | x | 6.1 (3.4-10.9) | 6.8 × 10⁻¹⁰ |
| 17 | rs12804004 | 0.6 (0.5-0.8) | 1.2 × 10⁻⁴ | 0.8 (0.6-1.0) | 0.07 | 0.6 (0.4-0.9) | 0.019 | 0.7 (0.5-1.1) | 0.159 |
| | rs35705950 | x | x | 7.9 (5.5-11.3) | 6.4 × 10⁻²⁹ | x | x | 5.9 (3.5-10.0) | 3.6 × 10⁻¹¹ |
| 18 | rs868903 | 1.6 (1.3-2.1) | 2.8 × 10⁻⁵ | 1.0 (0.8-1.4) | 0.753 | 1.8 (1.3-2.6) | 0.001 | 1.4 (0.9-2.0) | 0.145 |
| | rs35705950 | x | x | 7.8 (5.3-11.5) | 8.6 × 10⁻²⁶ | x | x | 5.6 (3.2-9.6) | 4.4 × 10⁻¹⁰ |

Example 2 Single Nucleotide Polymorphism rs35705950 Results in Increased Expression of MUC5B Gene

The wildtype G allele of the rs35705950 SNP is conserved across primate species. The SNP is directly 5′ to a highly conserved region across vertebrate species, and is in the middle of sequence predicted to be involved in MUC5B gene regulation. A bioinformatic analysis of the effect of the rs35705950 SNP predicts a disruption of an E2F binding site and creation of at least two new binding sites (e.g., HOX9 and PAX-2).

Based on these analyses, the effect of rs35705950 was examined on MUC5B gene expression. In lung tissue from 33 subjects with IPF and 47 unaffected subjects, quantitative RT-PCR revealed that MUC5B gene expression was upregulated 14.1-fold among IPF subjects compared to unaffected subjects (P=0.0001, FIG. 4A). A 37.4-fold increase in MUC5B expression was observed among unaffected subjects carrying at least one copy of the variant allele compared to homozygous wildtype subjects (P=0.0003, FIG. 4B). In contrast, no significant difference in MUC5B gene expression was observed among the IPF subjects with at least one variant allele of rs35705950 (FIG. 4C). Smoking, a potential confounder of MUC5B expression, appeared to have little effect on the association between the rs35705950 variant allele and MUC5B expression among either unaffected or IPF affected subjects (FIGS. 4B and 4C).
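Fold changes from quantitative RT-PCR are often derived with the comparative Ct (2^-ΔΔCt) method. The sketch below illustrates that calculation under the assumption of roughly 100% amplification efficiency; the text does not state which quantification method was used, and the Ct values and function name are hypothetical.

```python
def fold_change_ddct(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Each target Ct is normalized to a reference (housekeeping) gene Ct,
    and the case group is then compared with the control group.
    Assumes approximately 100% amplification efficiency for both assays.
    """
    delta_case = ct_target_case - ct_ref_case
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** (-(delta_case - delta_ctrl))


# Hypothetical mean Ct values, for illustration only
fold = fold_change_ddct(ct_target_case=24.0, ct_ref_case=18.0,
                        ct_target_ctrl=27.0, ct_ref_ctrl=18.0)
print(f"fold change = {fold:.1f}")
```

On this scale, each one-cycle decrease in normalized Ct corresponds to a doubling of transcript abundance, so a 14.1-fold increase would correspond to a ΔΔCt of about -3.8 cycles.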

MUC5B immunohistochemical staining in lung tissue showed cytoplasmic staining in secretory columnar cells of the bronchi and larger proximal bronchioles (>200 μm) in IPF cases and controls (FIG. 5A). In subjects with IPF, regions of dense accumulation of MUC5B were observed in areas of microscopic honeycombing and involved patchy staining of the metaplastic epithelia lining the honeycomb cysts (FIG. 5B), as well as the mucous plugs within the cysts (FIG. 5C). No obvious differences were observed in MUC5B staining characteristics in IPF cases with the MUC5B promoter polymorphism.

IPF subjects have significantly more MUC5B lung gene expression than controls, and MUC5B protein is expressed in pathologic lesions of IPF. The present results show that the risk of developing FIP or IPF is substantially correlated with the rs35705950 promoter polymorphism, which causes increased MUC5B expression. In aggregate, the data show that MUC5B expression in the lung plays a role in the pathogenesis of pulmonary disease.

Based on the relationship between the SNP and excess production of MUC5B, too much MUC5B can impair mucosal host defense, resulting in excessive lung injury from inhaled substances, and, over time, lead to the development of IIP. In addition to the MUC5B promoter SNP, common exposures and basic biological processes can influence either the expression or clearance of MUC5B. For instance, MUC5B expression can be enhanced in the lung by cigarette smoke, acrolein, oxidative stress, IL-6, IL-8, IL-13, IL-17, 17β-estradiol, extracellular nucleotides, or epigenetic changes that alter DNA methylation or chromatin structure. Moreover, clearance of lung mucus is dependent on effective ciliary motion, adequate hydration of the periciliary liquid layer, and an intact cough. Regardless of the cause, the present results indicate that excess MUC5B can compromise mucosal host defense, reducing lung clearance of inhaled particles, dissolved chemicals, and microorganisms. Given the importance of environmental exposures, such as asbestos, silica, and other pollutants in the development of other forms of interstitial lung disease, it is logical to speculate that common inhaled particles, such as those associated with cigarette smoke or air pollution, might lead to or contribute to exaggerated interstitial injury in individuals who have defects in mucosal host defense.

In addition, excess MUC5B in the respiratory bronchioles can interfere with alveolar repair. Alveolar injury can lead to collapse of bronchoalveolar units and this focal lung injury is repaired through re-epithelialization of the alveolus by type II alveolar epithelial cells. Thus, MUC5B can impede alveolar repair by either interfering with the interaction between the type II alveolar epithelial cells and the underlying matrix, or by interfering with the surface tension properties of surfactant. Either failure to re-epithelialize the basal lamina of the alveolus or suboptimal surfactant biology could enhance ongoing collapse and fibrosis of adjacent bronchoalveolar units, and eventually result in IIP.

Lesions of IPF are spatially heterogeneous, suggesting that IPF is multifocal, originating in individual bronchoalveolar units. Since SNP rs35705950 occurs in the promoter region of MUC5B and is predicted to disrupt transcription factor binding sites, ectopic production of MUC5B in cells or locations that cause injury to the bronchoalveolar unit can be a causative agent. Unscreened genetic variants (especially in the inaccessible repetitive mucin regions) may be in linkage disequilibrium with the MUC5B promoter SNP and affect the function of other lung mucins.

The present observations provide a novel clinical approach to pulmonary disorders such as IIP. Invoking secreted airway mucins in the pathogenesis of pulmonary fibrosis suggests that the airspace plays a role in the pathogenesis of IIP. While the SNP (rs35705950) in the MUC5B promoter can be used to identify individuals at risk for developing IIP, the observation that mucin biology is important in the etiology of IIP reorients the focus of pathogenic and therapeutic studies in interstitial lung disease to lung mucins and the airspace. Moreover, the genetic causes of IIP (e.g., MUC5B, surfactant protein C, surfactant protein A2, and the two telomerase genes TERC and TERT) provide insight into the unique clinical manifestations of this complex disease process, and consequently, lead to earlier detection, more predictable prognosis, and personalized therapeutic strategies.

Example 3 Genetic Variant MUC5B Associated with Attenuated Form of Pulmonary Disease

The data described herein demonstrate that the genetic variant MUC5B rs35705950 is associated with development of pulmonary disease. We next examined whether the rs35705950 genetic variant is associated with disease severity and prognosis. We found that homozygous wildtype subjects (GG), i.e., those having normal MUC5B gene sequence, displayed a steeper decline in forced vital capacity (FVC) over time as compared to subjects with at least one T allele (P=0.0006). Essentially, while FVC declines for both groups, the decline is more gradual in those carrying the G→T polymorphism. For GG subjects, the FVC absolute value fell from about 3.4 liters to about 3.1 liters over years 1-3 of the study. For GT and TT subjects, the FVC absolute value still fell, but started at over 3.5 liters and fell to about 3.4 liters over years 1-3 of the study.

Additionally, we observed an association with death: subjects having at least one T allele had lower mortality (OR (95% CI)=0.37 (0.20-0.67); p=0.001) after adjusting for gender, history of smoking, and DLCO (diffusing capacity of the lung for carbon monoxide). We also observed an association between the T allele and time to death (hazard ratio (95% CI)=0.50 (0.30-0.83); p=0.0069) after adjustment for gender, history of smoking, and DLCO. These results suggest that in addition to being a strong risk factor for pulmonary disease development, the rs35705950 SNP can indicate a less severe prognosis for the pulmonary disease.
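The relationship between a reported ratio estimate, its 95% confidence interval, and an approximate two-sided P value can be sketched as follows. The calculation assumes a Wald-type interval that is symmetric on the log scale, which is an assumption about how the intervals were constructed; the function name `wald_from_ci` is illustrative.

```python
import math


def wald_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from a ratio and its CI.

    Assumes the interval is log(estimate) +/- z_crit * SE, as in a Wald
    interval from logistic or Cox regression.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Estimates and intervals as reported in the text
for label, est, lo, hi in [("mortality OR", 0.37, 0.20, 0.67),
                           ("time-to-death HR", 0.50, 0.30, 0.83)]:
    z, p = wald_from_ci(est, lo, hi)
    print(f"{label}: z = {z:.2f}, p ~ {p:.3g}")
```

Because the published estimates and interval bounds are rounded to two decimal places, P values recovered this way can be expected to match the reported ones only approximately.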

1-86. (canceled)
 87. An in vitro complex comprising a nucleic acid probe hybridized to a nucleic acid comprising a genetic variant MUC5B gene sequence with a T allele at the rs35705950 single nucleotide polymorphism (SNP), wherein said nucleic acid is extracted from a human subject with a pulmonary fibrotic condition or is an amplification product of a nucleic acid extracted from a human subject with a pulmonary fibrotic condition.
 88. The complex of claim 87, wherein the nucleic acid probe is a labeled nucleic acid probe.
 89. The complex of claim 88, wherein the labeled nucleic acid probe is fluorescently labeled.
 90. The complex of claim 87, wherein the nucleic acid probe has at least 10 nucleotides.
 91. The complex of claim 87, wherein the nucleic acid probe comprises at least 10 contiguous nucleotides of the sequence of SEQ ID NO:24.
 92. The complex of claim 87, comprising more than one nucleic acid probe hybridized to the nucleic acid.
 93. The complex of claim 92, wherein each nucleic acid probe comprises at least 10 contiguous nucleotides of the sequence of SEQ ID NO:24.
 94. The complex of claim 87, wherein the human subject has a family history of idiopathic pulmonary fibrosis (IPF) or familial interstitial pneumonia (FIP).
 95. The complex of claim 87, further comprising a thermally stable polymerase. 